This document contains the edited transcriptions of the presentations delivered
at  the  Quality  of  Care  Research  Symposium,  sponsored  by  the  Health  Care 
Financing  Administration,  June  11-12,  1987,  at  the  Sheraton  Inner  Harbor  Hotel 
in  Baltimore,  Maryland.  Two  of  the  speakers,  Ann  Flood  and  Douglas  Wagner, 
substituted  formal  papers  in  place  of  their  transcribed  remarks. 

The Symposium covered several major issues relating to quality of care,
including the uses of outcome assessment, the strengths and weaknesses of
process  and  structural  measures  of  quality,  and  the  importance  of  feedback  in 
quality  of  care  activities. 

Many  people  in  the  Office  of  Research  and  Demonstrations  were  responsible  for 
the  success  of  this  Symposium.  Feather  Davis  coordinated  the  overall  planning 
and  development  of  the  symposium  as  well  as  the  preparation  of  these 
proceedings.  Sydney  Galloway  was  instrumental  in  the  management  of  this 
effort. The organization and planning of the conference were under the direction
of Mike Fitzmaurice, Acting Director, Office of Research; Marian Gornick,
Director, Division of Beneficiary Studies; and Paul Eggers, Chief, Program
Evaluation Branch. The Circle, Inc. was responsible for arranging the facilities,
travel,  production  of  the  transcriptions  and  other  operational  activities.  A 
special  commendation  is  due  to  Anne  Childress  who  was  responsible  for 
transcribing  the  tapes  and  writing  and  editing  this  document. 


Joseph  R.  Antos,  Director 

Office  of  Research  and  Demonstrations 

Health  Care  Financing  Administration 
Opening Remarks
William  L.  Roper,  M.D. 
Health  Care  Financing  Administration 


I  am  pleased  and  honored  to  welcome  you  to  this  conference  on  quality  of  care 
measurement.  In  our  Nation  today,  there  is  a  commitment  to  ensuring  health 
care  financing  for  the  aged  and  the  poor.  Moreover,  we  in  the  Department  of 
Health  and  Human  Services  are  not  only  committed  to  maintaining  the  quality  of 
care  provided  to  Medicare  and  Medicaid  beneficiaries,  but  we  are  committed  to 
enhancing  that  level  of  quality.  Secretary  Bowen  has  placed  that  as  one  of  his 
highest  priorities.  Accomplishment  of  this  requires  more  knowledge  and 
understanding  of  the  basic  issues  relating  to  quality  of  care  measurement  and 
assurance.  This  symposium  is  part  of  our  agenda  to  further  develop  that 
knowledge. 

The  issues  surrounding  quality  of  care  —  how  to  measure  quality  of  care  and  how 
to  monitor  and  assure  quality  of  care  —  are  not  new  to  the  health  care 
community. Rather, they date back many centuries. However, I don't think
there has ever been a time in our history when the urgency to "do it right" has
been greater. Last June, in a hearing before the Senate Finance
Committee,  I  talked  about  the  issue  of  quality  and  said  that  we  would  be  holding 
a  conference  to  focus  on  the  question  of  quality  measurement.  We  held  that 
conference  in  December,  meeting  with  a  number  of  leaders  from  the  health 
industry  and  consumer  organizations,  and  talked  about  what  we  knew  then  and 
what  we  intended  to  do  with  the  information  we  had  at  hand.  In  particular,  we 
focused  on  the  question  of  hospital  mortality  statistics  and  our  belief  that  we 
need  to  improve  our  ability  to  measure  quality  through  the  use  of  mortality 
statistics.  We  are  determined  to  use  the  best  information  we  have  to  better 
inform  the  public  so  that  they  can  make  choices  about  their  health  care,  and  we 
intend  to  proceed  on  that  initiative  to  release  information  this  fall.  We  continue 
to  solicit  advice  and  assistance  because  this  release  of  information  is  not  a  single 
event,  but  rather  will  be  a  continuing,  and  we  believe,  an  improving  process  for 
quality  measurements  and  information  release. 

The  purpose  of  this  conference  is  to  step  back  from  the  immediacy  of  what  we're 
doing  in  order  to  focus  on  what  we  know  from  research  and  to  think  about  the 
direction research needs to go in the future in order to improve our knowledge.
I'm  pleased  that  such  a  broad,  representative  group  of  people  has  chosen  to  come 
and  to  assist  us  in  this  task.  In  addition  to  the  improvement  of  our  research 
agenda,  we  want  to  stimulate  the  interest  of  the  research  community  in  the 
series  of  key  questions  which  will  be  addressed  today. 

My  ultimate  goals  for  this  symposium  are  fourfold.  I  hope  for  knowledge  and 
guidance  in  order  to  achieve: 

o  Technical refinement in our quality of care measures;

o  Clarification of additional data needs and potential new measures;

o  Improvement of methods for monitoring and assuring quality of care;
and

o  Development of improved techniques for communicating with both
providers and consumers of health care regarding the interpretation
of quality of care indices.

I  want  to  express  my  appreciation  to  the  participants  for  coming  here  today  and 
to  the  distinguished  audience  of  attendees  whose  participation  we  look  forward 
to  in  the  scheduled  presentations  and  workshops.  I  especially  want  to  thank  Dr. 
Gail  Wilensky,  Director  of  the  Center  of  Health  Affairs  at  Project  Hope,  who 
was  kind  enough  to  agree  to  moderate  this  symposium,  as  well  as  having  served 
as  moderator  of  the  conference  we  held  last  December. 

I  also  want  to  thank  the  staff  in  the  Health  Care  Financing  Administration 
(HCFA)  who  have  assisted  us  in  planning  this  symposium.  I  especially  want  to 
thank  Dr.  Joseph  Antos,  who  has  recently  taken  over  the  duties  as  Director  of 
HCFA's  Office  of  Research  and  Demonstrations,  and  the  staff  in  that  office  who 
have  worked  on  the  plans  for  this  symposium. 

Again,  I  am  very  appreciative  of  the  participation  of  both  the  presenters  and 
attendees.  I  look  forward  to  the  new  knowledge  that  comes  from  this  symposium 
to  help  in  the  direction  of  HCFA's  quality  of  care  agenda. 
Summary of Research Issues in Quality of Health Care
Addressed  at  the  Quality  of  Care  Research  Symposium 
June  11-12,  1987  in  Baltimore,  Maryland 

Feather  Ann  Davis,  Ph.D. 

Introduction 

During  the  past  year,  the  Health  Care  Financing 
Administration  (HCFA)  has  sponsored  a  number  of  meetings  on 
quality  of  care  measurement,  data  development  and  release, 
and  research  priorities.  In  June  1987,  HCFA  convened  a 
Quality  of  Care  Research  Symposium  to  provide  a  national 
forum  for  detailed  discussion  of  the  current  status  of 
quality  of  care  research  and  identification  of  research 
needs.  Fifteen  experts  in  various  aspects  of  quality  of 
care  measurement  and  analysis  gave  presentations,  followed 
by  four  smaller  work  groups  in  which  speakers  and  attendees 
pooled  their  ideas.  This  Summary  provides  an  overview  of 
the  issues  addressed,  the  general  themes  of  the 
discussions,  and  the  research  priorities  identified. 


Background 

The  past  decade  has  seen  a  proliferation  of  the  literature 
on  many  aspects  of  quality  assurance  and  utilization 
review.  Despite  the  marked  surge  in  the  volume  of  this 
literature,  the  calibre  has  varied,  and  neither  the 
conceptualization  nor  the  measurement  instrumentation  of 
health  care  quality  has  progressed  much  beyond  the  status 
of  the  early  1970s. 

Researchers  in  the  field,  indeed,  still  find  great  merit  in 
the  conceptual  framework  first  developed  by  Avedis 
Donabedian,  which  is  often  reduced  to  its  three  major 
components:  structure,  process,  and  outcome.  Donabedian 
has  noted  that  these  are  not  attributes  of  quality,  but  are 
approaches  to  the  acquisition  of  information  about  the 
presence  or  absence  of  the  attributes  that  constitute  or 
define  quality. 

Donabedian  proposed  an  integration  of  the  dimensions  of 
quality  and  its  analysis  through  an  emphasis  on: 

(1)  the  need  to  adequately  conceptualize  the 
components  of  health  (physical-physiological 
function,  psychological  function,  and  social 
function) ; 

(2)  the  levels  of  aggregation  and  organization  of  the 
providers  of  care;  and 
(3) the levels of aggregation of the actual or
potential  recipients  of  care. 

By  emphasizing  the  various  possible  levels  of  aggregation 
of  patients,  populations,  and  providers,  Donabedian  has 
begun  to  enumerate  the  broad  range  of  foci  necessary  to 
fully  assess  the  many  facets  of  medical  care  quality. 
(Donabedian,  1980.) 


Symposium  Issues  and  Themes 

The speakers at the Symposium reaffirmed the multifaceted
nature of "quality" and reiterated Donabedian's point that
because  there  is  no  single  definition  of  "quality"  for 
medical  care,  there  is  no  single  measure  of  it.  No  one 
composite  index  of  quality  of  health  care  has  been,  or 
probably  ever  can  be,  developed. 

The  Symposium  discussions  made  it  clear  that  there  is  a 
need  for  several  levels  of  analysis  and  monitoring. 
Participants  expressed  strong  support  for  the  use  of 
epidemiologic  techniques  for  monitoring  health  care  at  one 
level  of  aggregation,  and  the  development  of  diagnosis- 
specific  criteria  for  chart  review  at  another  level.  They 
expressed  equally  strong  support  for: 

(1)  epidemiologic  monitoring  of  patient  outcomes  such 
as  mortality,  morbidity,  and  disability; 

(2)  local  area  analyses  to  identify  problems  in  both 
patient  outcomes  and  health  care  processes;  and, 

(3)  analysis  and  monitoring  of  both  the  outcomes  and 
processes  of  care  at  the  institution  or  medical 
care  plan  level. 

While  historically  there  have  been  heated  debates  about  the 
relative  merits  of  structural  measures,  procedural 
criteria,  or  patient  outcome  measures,  the  speakers  at  the 
Symposium  supported  the  appropriateness  of  the  use  of  both 
process and outcome measures of quality. They also
believed  strongly  in  the  overriding  need  to  clearly 
identify  the  interrelationships  among  the  structural,  the 
procedural,  and  the  outcome  measures  of  quality  of  care. 


Data  Needs 

During  the  past  twenty  years,  progress  in  assessing  the 
quality  of  health  care  has  been  impeded  not  only  by  lack  of 
agreement about the appropriate indicators of good care,
but also by lack of detailed data bases on the condition of
the  patient  and  the  care  provided  to  patients.  Data  in 
sufficient  detail  can  be  made  available  for  clinical  trials 
or other types of special studies; but in the day-to-day
world  of  medicine,  data  are  seldom  adequate  for  analysis 
and  understanding  of  complex  processes  and 
interrelationships.  In  addition,  the  data  are  neither 
uniform  nor  easily  accessed,  while  errors  in  recording  and 
abstraction  are  other  long-recognized  problems.  The 
Symposium  participants  reaffirmed  the  importance  of  the 
collection  of  accurate  data. 

The  need  for  more  uniform  reporting  of  data  was  another 
common  plea — in  particular,  uniform  reporting  of  patient 
characteristics  and  a  uniform  clinical  data  base.  One 
high-priority  area  was  the  development  of  a  global  measure 
of  patient  physiologic  status  and  physiologic  reserve  that 
can  provide  a  reliable  determination  of  patient  prognosis 
(likelihood of responding to treatment). Such a measure
does  not  represent  "quality,"  but  it  is  important  for 
interpreting  other  process  and  outcome  measures. 


Analysis  of  Imperfect  Data 

Dr.  Robert  Brook  opened  the  Symposium  with  the  provocative 
question:  "Will  imperfect  information  about  health  care 
quality  lead  to  better  health  or  will  it  lead  to  increased 
social  divisiveness?"  As  with  all  research,  the  readily 
available  health  care  data  do  not  always  adequately  measure 
the  desired  theoretical  concepts.  Therefore,  "quality  of 
care"  analyses  often  have  been  limited  to  those  aspects  for 
which  data  exist  or  can  be  developed  at  relatively  little 
expense.  While  hospital  discharge  abstract  data  bases  have 
facilitated  analyses  of  patient  length  of  stay  and  patient- 
based  mortality  and  readmission,  these  are  admittedly 
either  poor  proxies  for  "quality"  or  they  are  ambiguous 
regarding  interpretation. 

The  Symposium  participants  understood  the  concerns  of 
health  care  providers  regarding  the  use  of  imperfect 
measures  of  quality  of  care;  however,  several  speakers 
urged  that  research  and  the  dissemination  of  data  and 
research  results  not  await  the  perfect  data  base.  They 
pointed  out  that  a  number  of  existing  surveys  contain  data 
which,  if  linked  with  Medicare  utilization  data,  could 
provide  a  more  thorough  analysis.  It  was  also  emphasized 
that  sampling  is  an  appropriate  means  of  estimating 
phenomena  in  a  population,  and  that  surveys  can  be  used  to 
periodically  collect  specific,  necessary  data. 

Linkages  among  extant  data  bases,  such  as  tumor  registries, 
state-maintained death certificates, National Center for
Health  Statistics  surveys  of  disease  incidence  and 
prevalence,  surveys  of  functional  status,  etc.  are 
feasible,  particularly  the  linkage  of  these  clinically- 
oriented  data  bases  with  the  utilization  data  from  the 
Medicare  statistical  system.  Such  linkages  are  not  always 
expensive.  Participants  felt,  however,  that  they  do  not 
provide  a  thoroughly  satisfactory  solution  to  the  need  for 
improved  and  expanded  routine  data  bases  for  the  types  of 
priority  data  described  above. 
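
The kind of linkage described above can be sketched in a few lines. This is only an illustration: the record layouts, the field names (`beneficiary_id`, `functional_limitations`, `hospital_days`), and the values are hypothetical, and real linkage would also have to deal with missing or mismatched identifiers.

```python
# Hypothetical records; field names and values are illustrative only.
survey = [
    {"beneficiary_id": "A1", "functional_limitations": 2},
    {"beneficiary_id": "B2", "functional_limitations": 0},
    {"beneficiary_id": "C3", "functional_limitations": 4},
]
utilization = [
    {"beneficiary_id": "A1", "hospital_days": 12},
    {"beneficiary_id": "C3", "hospital_days": 30},
    {"beneficiary_id": "D4", "hospital_days": 5},
]


def link_records(left, right, key):
    """Join two lists of records, keeping only identifiers found in both."""
    index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            # Merge the clinical and utilization fields into one record.
            linked.append({**row, **match})
    return linked


linked = link_records(survey, utilization, "beneficiary_id")
for record in linked:
    print(record)
```

Only persons present in both sources survive the join, which is one reason linkage alone does not substitute for improved routine data bases.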

The  Symposium  also  identified  a  need  for  information  on  all 
sectors of health care, including ambulatory care in
physician  offices,  home  health  care,  and  nursing  home  care 
as  well  as  the  frequently-studied  inpatient  care.  In 
particular,  given  the  increasing  interest  in  health 
maintenance  organizations  and  other  forms  of  capitated 
health  care  payment,  there  is  a  strong  demand  for 
information on both the capitated and fee-for-service
sectors.  Dr.  Mark  Blumberg  emphasized  the  need  for 
measures  and  data  collection  which  permit  analysis  of 
comparability  across  provider  types  and  across  larger 
health  care  systems. 

The  participants  also  expressed  a  need  for  more  analysis  of 
the  quality  of  care  provided  to  specific  sub-populations. 
A  particular  concern  was  that  cost  savings  efforts  might 
motivate  providers  to  discriminate  in  the  types  of  patients 
they  accept  and/or  the  types  of  care  they  provide. 

While  there  is  a  need  to  analyze  the  appropriateness  of 
clinical  decisions,  there  is  also  clearly  a  role  for 
population-based  analyses  of  population  health  status, 
mortality,  morbidity,  disability,  and  health  care 
utilization  which  is  independent  of  the  adequacy  of  the 
medical  processes  employed  by  health  care  practitioners. 

Currently  there  are  no  "quality  of  care"  data  bases  on 
regional,  state,  or  federal  levels.  But  as  the  pressure 
for  comparable  information  on  health  care  "quality" 
measures  increases  so  that  consumers  may  make  enlightened 
decisions  regarding  selection  of  individual  providers, 
health  care  plans,  and  insurance  packages,  there  will 
likely  be  increased  pressure  for  better,  comparable  data 
bases. 

It  appears  that  researchers,  practitioners,  and  purchasers 
of care are beginning to agree that it is desirable, even
imperative,  to  develop  information  on  attributes  felt  to 
reflect  aspects  of  quality  in  health  care.  This  heightened 
awareness,  verging  on  impatience,  should  further  the 
development and refinement of measures, data bases,
routine  monitoring  mechanisms,  and  the  analytic  skills  of 
all  concerned  parties.  It  is  generally  believed  that  these 
improvements  will  ultimately  improve  both  the  quality  of 
health  care  and  the  utilization  of  resources. 

In  summary,  the  thoughtful  presentations  and  discussions  of 
the  Symposium  acknowledged  that  while  there  is  reason  for 
concern  about  the  effects  of  incomplete  data,  there  is  also 
reason  for  moving  ahead  with  carefully-designed 
measurement,  analysis,  and  feedback. 

References 

Donabedian, Avedis. Explorations in Quality Assessment and
Monitoring, Volume I: The Definition of Quality and
Approaches to Its Assessment. Ann Arbor, MI: Health
Administration Press, 1980.
OVERVIEW: Major Issues in Quality of Care Assessment

*  The  health  care  research  community  has  some 
long-standing  questions  about  quality  of 
care,  ranging  from  the  goal  of  quality 
assurance,  through  the  methods  to  be  used, 
to  the  ultimate  impact  of  a  national  system. 

*  There  are,  however,  some  areas  of  general 
agreement,  such  as  the  effects  of  cost 
reduction  policies  and  the  existence  of  wide 
variations  in  quality,  where  action  is  now 
appropriate. 

*  The  research  community  now  needs  to 
coordinate  its  efforts  to  create  a  fair  and 
balanced  system  of  quality  assurance  that  is 
capable  of  raising  health  care  in  America  to 
a  higher  level. 


By  Robert  Brook,  M.D.,  D.Sc. 

Over the past 20 years, the health care research community
has  considered  a  number  of  questions  relative  to  quality  of 
care — and  is  still  considering  them  today.  I  would  like  to 
recapitulate  10  of  those  questions,  and  then  point  out  some 
areas  of  general  agreement  as  well  as  areas  where  more 
progress  is  needed. 

1.  What  is  the  goal  of  quality  assurance:  to 
improve  health,  to  decrease  costs,  or  to  document  that 
American  medicine  is  the  best  in  the  world?  Unfortunately, 
more  research  activity  has  been  devoted  to  the  last  two 
purposes  than  to  the  first. 

2.  What  data  do  we  use:  outcome,  process,  or 
structure? 

3.  Within  each  of  these  categories,  what  specific  data 
do  we  use?  Which  outcomes?  What  time  windows?  Which 
processes? 

4.  How  do  we  set  criteria,  through  explicit  or 
implicit  methods? 

5.  What  standards  do  we  use:  ideal,  average, 
empirical,  or  theoretical? 
6. Should our measurement system be population-based
or patient-based? Is access to care one of the factors we
consider? Do we focus on under-use, or over-use, or both?

7. Can results be tied to action? What does one do
with  results  on  physician  performance  in  the  absence  of  a 
disease  model  that  relates  process  to  outcome?  What  does 
one  do  with  results  on  HMO  performance  without  a  model  that 
relates  the  structure  of  the  HMO  to  outcome? 

8.  Who  maintains  the  system?  At  present  we  do  not 
have  the  capability  to  maintain  a  quality  assurance  system 
on  either  a  local,  or  a  regional,  or  a  national  level. 

9.  How  do  we  change  behavior?  Through  education, 
through  changes  in  the  system,  or  (given  the  likelihood  of 
an  oversupply  in  the  next  few  years)  through  selection  of 
physicians?  Who  drives  the  system,  the  physicians  or  the 
public? 

10.  Finally,  will  it  do  more  harm  than  good?  From  the 
physician's  perspective,  there  is  a  risk  that  a 
performance-based  review  system  will  prove  a  straitjacket 
that  forces  physicians  to  practice  medicine  like 
automatons.  From  the  public's  perspective,  there  is  an 
even  greater  risk:  Will  imperfect  information  about 
quality  of  care  lead  to  better  health,  or  will  it  increase 
social  divisiveness? 

Many  of  these  questions  will  be  discussed  at  length  during 
the  seminar.  Rather  than  anticipate  those  discussions,  I 
would  like  to  move  on  to  some  areas  where  conclusions  are 
already  clear  and  action  is  appropriate. 


Economic  Factors 

Illustrating  the  first  point  I  wish  to  make  is  a  California 
study of the effect of disenfranchisement on medically
indigent adults. When the Medi-Cal program in which they
were  enrolled  was  terminated,  the  short-term  effect  was  a 
rise  of  10  millimeters  in  the  mean  blood  pressure.  Over 
the  long  term,  this  group  showed  a  high  rate  of  excess 
deaths  when  compared  to  the  control  group.  The  point  is 
that  changes  in  public  policy  can  affect  quality,  and 
changes  in  quality  can  either  kill  people — or  improve  their 
health.  This  is  not  a  theoretical  issue,  and  it  is  our 
responsibility  as  a  research  community  to  make  the  possible 
consequences  of  policy  changes  clear  to  policy-makers  and 
the  public. 
The second point is that an economically-driven system of
health  care  will  have  a  blunt  effect  on  quality.  Our  work 
in  the  Rand  Health  Insurance  Experiment  provides  an 
illustration,  through  our  study  of  the  impact  of  cost- 
sharing  on  the  use  of  ambulatory  care.  We  divided  all 
episodes  of  ambulatory  care  into  highly  effective,  quite 
effective,  and  rarely  effective  interventions,  and  we  asked 
whether  the  proportions  of  these  episodes  were  different 
for  groups  with  co-insurance  and  groups  receiving  free 
care.  We  found  relatively  the  same  decrease  in  all  three 
types;  in  other  words,  highly  effective  care  decreased  as 
much  as  rarely  effective  care  when  cost-sharing  was 
implemented.  Thus  it  seems  clear  that  a  system  designed  to 
reduce  costs  will  decrease  both  appropriate  and 
inappropriate  care  to  an  equal  extent. 
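
The arithmetic behind this "blunt effect" can be made concrete with a small sketch. The episode counts below are invented for illustration and are not the experiment's data; the function names are likewise hypothetical. The point shown is that when every category falls by the same fraction, the mix of care stays unchanged even though total use drops.

```python
# Illustrative episode counts per 1,000 enrollees; NOT the experiment's data.
free_care = {
    "highly effective": 400,
    "quite effective": 300,
    "rarely effective": 200,
}
cost_sharing = {
    "highly effective": 300,
    "quite effective": 225,
    "rarely effective": 150,
}


def relative_decrease(before, after):
    """Fractional drop in episodes for each category of care."""
    return {k: (before[k] - after[k]) / before[k] for k in before}


def shares(counts):
    """Proportion of all episodes that falls in each category."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


decreases = relative_decrease(free_care, cost_sharing)
for category, drop in decreases.items():
    print(f"{category}: {drop:.0%} fewer episodes")
print("mix under free care:   ", shares(free_care))
print("mix under cost-sharing:", shares(cost_sharing))
```

A cost-reduction mechanism that improved quality would instead show a larger drop in the rarely effective category than in the highly effective one.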

Variations 

My  next  point  can  best  be  illustrated  by  citing  specific 
instances:

*  The  physician  performance  index  of  what  physicians 
think  ought  to  be  done,  which  was  used  by  Payne  in 
Hawaii,  averaged  70  (on  a  scale  of  0-100)  for 
hospitalized  patients  and  45  for  ambulatory  patients. 

*  The  New  Mexico  EMCRO  study  showed  that  50%  of  visits 
by  AFDC  women  and  children  were  associated  with 
injections,  that  50%  of  those  injections  were 
antibiotics,  and  that  7%  of  the  physicians  gave  over 
50%  of  all  injections. 

*  A study by the Rand Health Insurance Experiment found
that  40%  of  all  hospital  admissions  in  the  fee-for- 
service  system  seemed  to  be  either  not  acute  or 
discretionary.

*  The evaluation of the NIH Consensus Program suggested
that  40%  of  coronary  angiograms  and  coronary  artery 
bypass  surgery  were  inappropriate  or  of  equivocal 
benefit.  Another  study  of  these  procedures  has  shown 
rates  that  vary  three-fold  across  areas  as  large  as 
states. 

*  One-third  of  carotid  endarterectomies  performed  in 
academic  VA  hospitals  have  been  evaluated  as 
inappropriate  or  equivocal. 

*  In  a  Vermont  study,  the  probability  that  a  man  had  a 
prostatectomy  by  the  age  of  80  varied  from  20%  to  80%, 
depending on where he lived in that rather small
state.


One  could  continue  almost  indefinitely,  but  the  point  is 
that problems in quality and appropriateness are not
outlier  phenomena.  The  average  level  of  care  provided  to 
the  average  American  by  the  average  physician  is  something 
no  one  wants.  This  is  not  to  say  that  doctors,  or  the  fee- 
for-service  system,  or  HMOs  are  bad — but  it  is  to  say  that 
we  have  a  major  problem  with  deficiencies  in  care,  and  that 
the  variation  in  quality  of  care  is  extremely  large. 

What  Is  Needed? 

There  is  an  overwhelming  mass  of  data  indicating  that  we 
need  to  raise  the  entire  health  care  system  to  a  higher 
plane.  As  a  research  community,  we  must  turn  our  efforts 
to  creating  a  fair,  balanced  system  for  assessing  quality 
that: 

*  Detects  under-use  and  over-use; 

*  Assesses outcome conditional on the performance of a
procedure or delivery of a service;

*  Is population-based;

*  Is based on the best possible synthesis of knowledge
and judgment;

*  Has explicit standards and criteria that are in the
public domain;

*  Has public disclosure of results; and

*  Has a conceptual model that ties deficiencies into
corrective or preventive actions.

In  addition,  we  as  researchers  need  to  construct  this 
quality  of  care  system  so  that  it  will  counter  the  economic 
incentives of whatever method we use to pay physicians. Payment on a
capitated basis creates a pressure to under-use, while the
fee-for-service system creates a pressure to over-use. All
of  us  know  that;  now  we  must  agree  that  the  quality  of  care 
system  should  create  a  counter-pressure.  And  we  must  work 
together  in  a  coordinated  way  to  produce  a  set  of 
standards,  criteria,  and  measures  that  will  achieve  this 
goal  and  be  valid  and  useful  on  a  national  basis. 
* * *

Dr.  Robert  Brook,  a  physician,  began  his  work  20  years  ago 
at  the  Johns  Hopkins  University.  He  has  continued  research 
through  his  years  with  the  Public  Health  Service  and  in  his 
present  capacity  as  Deputy  Director  of  the  Rand 
Corporation,  where  he  is  currently  the  co-principal 
investigator  on  a  major  study  of  the  effect  of  the 
prospective  payment  system  on  quality  of  care  for  the 
elderly.  Dr.  Brook  is  also  Chief  of  the  Division  of 
Geriatrics  and  Professor  of  Medicine  and  of  Public  Health 
at  the  UCLA  Center  for  the  Health  Services,  as  well  as 
being  director  of  the  Robert  Wood  Johnson  Clinical  Scholars 
Program  with  the  Department  of  Medicine. 
OUTCOME ASSESSMENT FOR THE PURPOSE OF DETERMINING
THE  HEALTH  STATUS  OF  THE  PUBLIC 


What  are  the  time  trends  (deterioration  and/or  improvement) 
of  the  health  status  of  the  population,  and  what  are  the 
problem  areas? 

*  Neither  global  epidemiological  measures  of 
outcome  and  mortality  nor  case-mix  studies 
are  satisfactory.  The  former  lacks 
information  on  service  venues  and  changes  in 
service  patterns,  while  the  latter  offers  a 
static  cross-section  of  a  single  service 
episode  that  does  not  consider  the  shifting 
health  status  of  the  population. 

*  A  balanced  approach  combines  the  two 
methods.  The  performance  of  a  given  service 
is  compared  to  general  mortality  trends.  If 
the  two  are  not  proportional,  case  mix  is 
studied  for  changes  in  system  behavior. 

*  One  way  to  achieve  this  balance  is  through 
linkage  of  the  National  Long-Term  Care 
Survey,  which  tracks  the  functional  health 
status  of  a  defined  population,  with 
Medicare  Part  A  data  showing  that 
population's  use  of  services. 


By  Kenneth  G.  Manton,  Ph.D. 

Changes  in  quality  of  care  must  be  assessed  against  some 
standard  or  index  of  the  prevailing  health  characteristics 
of  the  population  in  order  to  control  for  "real"  changes  in 
health  risks  due  to  either  greater  age  (i.e.,  a  change  in 
demographic  composition  to  higher  risk  groups)  or  to  higher 
health  risks  (e.g.,  an  increase  in  the  incidence  or 
prevalence of certain diseases).
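
One common way to control for a change in demographic composition is direct age standardization. The sketch below is offered only as an illustration of that general technique, not as the method used in this paper. The age bands, populations, and death counts are invented. Age-specific death rates are held constant between the two periods, so any rise in the crude rate comes from aging alone.

```python
def crude_rate(deaths, population):
    """Total deaths divided by total population."""
    return sum(deaths.values()) / sum(population.values())


def directly_standardized_rate(deaths, population, standard_population):
    """Weight each age-specific rate by a fixed standard population."""
    total_standard = sum(standard_population.values())
    weighted = sum(
        (deaths[age] / population[age]) * standard_population[age]
        for age in standard_population
    )
    return weighted / total_standard


# Invented figures: same age-specific rates, older population later on.
pop_base = {"65-74": 1000, "75-84": 500}
deaths_base = {"65-74": 30, "75-84": 40}
pop_later = {"65-74": 1000, "75-84": 1000}
deaths_later = {"65-74": 30, "75-84": 80}

std_base = directly_standardized_rate(deaths_base, pop_base, pop_base)
std_later = directly_standardized_rate(deaths_later, pop_later, pop_base)

print("crude:", crude_rate(deaths_base, pop_base), crude_rate(deaths_later, pop_later))
print("standardized:", std_base, std_later)
```

The crude rate rises while the standardized rate does not, which separates a "real" change in population risk from a change in the care delivered.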

There are two fundamental approaches: controls for case
mix  and  comparison  to  an  epidemiological  standard.  The 
first  involves  generating  an  index  for  the  medical  severity 
of a patient's conditions. To be effective for
reimbursement, the case mix index must describe differences
in resource consumption. To be effective in standardizing
quality of care assessments, it must reflect the
patient's risk level for adverse outcomes. Current case
mix indices such as DRGs have generally been designed for
the  former  purpose;  the  latter  is  intrinsically  more 
difficult.  This  is  a  micro-level  approach  that  does  not 
require  knowing  the  population  distribution  of  health 
problems.

Comparison  to  an  epidemiological  standard  is  a  macro-level 
approach,  and  it  involves  comparisons  of  changes  in  outcome 
against  changes  in  population  health.  Thus  if  mortality 
for  lung  cancer  (largely  an  untreatable  condition) 
increases,  we  would  expect  an  increase  in  mortality  as  an 
outcome  even  if  quality  of  care  has  improved. 

A  detailed  assessment  will  require  both  approaches.  First, 
one  would  conduct  a  macro-level  comparison  to  see  if 
overall  performance  is  maintained  (e.g.,  whether  an 
increase  or  decrease  in  hospital  mortality  is  proportional 
to overall trends). If it is not proportional, one will
then  assess  how  system  behavior — that  is,  where  people  go 
for  service — has  changed  by  examining  case  mix  for  specific 
service  types. 
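
The two-step logic can be sketched as a simple screening rule. This is a minimal illustration under stated assumptions: the function names are hypothetical, "proportional" is read here as "similar relative change," and the 5-percentage-point tolerance is an arbitrary choice, not a value given in the presentation.

```python
def relative_change(before, after):
    """Fractional change from one period to the next."""
    return (after - before) / before


def needs_case_mix_review(hospital_before, hospital_after,
                          population_before, population_after,
                          tolerance=0.05):
    """Flag a service whose mortality trend departs from the overall trend.

    The tolerance is an assumed threshold for illustration only.
    """
    hospital_change = relative_change(hospital_before, hospital_after)
    population_change = relative_change(population_before, population_after)
    return abs(hospital_change - population_change) > tolerance


# Invented rates: hospital mortality up 20%, population mortality up 2%.
print(needs_case_mix_review(0.050, 0.060, 0.0500, 0.0510))
# Invented rates: both up about 2%, so no case-mix review is triggered.
print(needs_case_mix_review(0.050, 0.051, 0.0400, 0.0408))
```

A flagged service is only a prompt for the second step, the examination of case mix; it does not by itself indicate a change in quality.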

System  behavior  can  be  affected  by  a  change  in 
reimbursement strategy. This can happen in three ways:

*  Change  in  service  site.  A  familiar  example  is  the 
decline  in  hospitalization  rates,  which  suggests  that 
people  are  now  being  treated  elsewhere.  This  will 
affect  the  hospital-based  mortality  rate  if  it  is 
systematic. 

*  Changes  in  duration  of  service  use.  Again,  we  see 
that  those  people  who  do  still  go  to  hospitals  stay 
there  for  a  shorter  time. 

*  Changes  in  actual  outcome.  This  type  of  change 
suggests  that  resource  constraints  have  affected  the 
outcome  of  care.  We  must  control  for  both  changes  in 
service  use  patterns  and  the  medical  status  of  the 
patient  in  analyzing  our  data. 

Case-mix  strategies  have  a  limited  ability  to  deal  with 
these  problems.  First,  they  are  oriented  to  reimbursement 
and  therefore  focus  on  one  service  type  only.  Second,  they 
do  not  deal  with  change  and  duration  because  they  were 
originally  constructed  as  static  cross-sections  of  case  mix 
characteristics.  The  epidemiological  approach  offers  a 
more  viable  alternative  because  it  describes  system 
performance,  including  quality  of  outcome,  not  in  terms  of 
one  service  setting  but  in  terms  of  the  total  performance 
for the population.


Illustrations of an Epidemiological Approach


We  can  now  illustrate  and  discuss  some  of  the  data  and 
methods  that  could  be  used  in  a  population-based  assessment 
of  quality  outcome.  To  begin  at  the  most  macro  level,  it 
is  useful  to  look  at  what  is  happening  with  total  mortality 
in  the  country.  Figure  1  shows  monthly  total  mortality 
rates  for  the  years  1982  through  1986,  and  we  see  that  the 
rates  moved  both  up  and  down  over  this  period.  We  do  not, 
however,  know  why. 

If  we  focus  on  1982-86  in  terms  of  mortality  rates  for 
persons aged 65 to 74 (Figure 2), we see an encouraging
trend  for  this  large  population  group.  An  analysis  of 
total  mortality,  however,  is  not  fully  satisfactory  because 
trends  are  driven  by  changes  in  the  cause  of  death. 

These  changes  are  products  of  both  short-  and  long-term 
fluctuations. One short-term factor is the
pneumonia/influenza cycle, which is shown in Figure 3 for
the  years  1975  to  1985.  This  effect  is  echoed  in  the  rates 
for  heart  disease  and  chronic  respiratory  disease,  which 
are  often  due  to  pneumonia/influenza  complications. 

Figure  4  shows  long-term  changes  in  the  14  most  common 
causes  of  death — a  complex  picture  with  both  increases  and 
decreases.  Some  of  these  causes  have  greater  potential  for 
successful  medical  intervention  than  others.  Lung  cancer, 
which  is  a  major  component  of  male  mortality  rates,  is 
relatively  invulnerable  to  intervention.  More  amenable  to 
treatment  are  heart  disease  and  stroke,  both  of  which  show 
dramatic  declines — although  these  may  in  part  be  due  to 
lifestyle  changes  as  well. 

One  point  that  is  not  often  recognized  is  the  prominent 
role  of  cohort  differences  in  recent  changes  in  mortality 
rates.  Figures  5  and  6  show  the  changes  for  stroke  and 
cancer  respectively  in  different  birth  groups.  Of  interest 
is  the  fact  that  the  oldest  cohorts  have  the  highest 
mortality  for  stroke,  but  lower  mortality  for  cancer;  the 
rates  can  move  in  either  direction. 

We  also  tend  to  forget,  when  we  look  at  national 
comparisons,  that  there  is  considerable  variation  by 
locality.  Figure  7  shows  all  cancer  mortality  rates  for 
white  males  from  1970  to  197  0,  broken  out  by  county  across 
the  United  States.  The  variation  of  the  age-standardized 
risk  for  this  large  and  heterogeneous  group  of  diseases  is 
over  2  to  1. 

Thus  we  see  that  an  epidemiological  approach  makes  a 
comparison to a moving target, and a target that is being
driven by a number of forces that we must identify and
extract  in  order  to  decide  whether  the  comparison  to  the 
baseline  is  an  appropriate  one. 

The  Dynamics  of  Mortality 

The  above  data  are  useful  for  setting  a  context.  They  are 
limited  because  they  are  not  related  to  specific  service 
settings. To overcome this deficiency, we have taken
National  Long-Term  Care  Survey  files  for  the  chronically 
disabled  community-based  population  on  two  dates  in  1982 
and  1984,  and  linked  them  to  Medicare  Part  A  data  for 
service  use  of  hospitals,  skilled  nursing  facilities,  and 
home  health  agencies.  This  gives  us  a  data  base  that 
matches  the  detailed  health  characteristics  of  a  nationally 
representative  sample  to  the  use  of  various  service  venues. 

The  question  is  how  to  analyze  such  data  so  as  to  deal  with 
the  complexities  of  a  population  that  moves  back  and  forth 
among  various  services.  What  we  have  done  is  to  decompose 
the  time  line  for  service  use  into  episodes  of  different 
types.  We  then  calculated  life  table  measures  of  exposure 
and  change  for  these  different  service  types. 

This  is  illustrated  in  Figure  8,  which  shows  hospital 
episodes  for  the  chronically  disabled  that  ended  in  a 
discharge  to  home  health  agencies.  Figure  9  gives  a 
broader  view,  showing  different  types  of  episodes  and  their 
resolution.  The  mortality  outcome  turned  out  to  be  a 
surprisingly  sensitive  measure,  indicating  reductions  which 
reflect  the  overall  pattern  of  shifting  mortality  level. 

An  alternative  approach  is  to  redefine  the  episodes.  In 
Figure  10  we  have  ignored  intermediate  transitions  and  only 
examined  the  time  from  hospital  entry  to  readmission  or 
death.  The  non-response  group  shown  in  the  chart  proved  to 
be  an  extremely  ill  sub-population  who  did  not  respond  to 
the  National  Long-Term  Care  Survey  because  of  high 
mortality  and  morbidity. 
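
The episode-based life table measures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual procedure: the record layout, the function name `episode_life_table`, and the sample episodes are hypothetical, and "EOS" in Figure 10 is assumed to mean end of study. The sketch partitions a radix of 100,000 episodes by type of exit and computes the weighted mean episode duration for each exit type, which is how the l0 and e0 columns of Figure 10 appear to be related.

```python
from collections import defaultdict

RADIX = 100_000  # conventional life table starting population


def episode_life_table(episodes):
    """episodes: iterable of (duration_days, exit_type, weight) tuples.

    Returns {exit_type: (l0, e0)}, where l0 is the weighted share of the
    radix leaving by that exit type and e0 is the weighted mean duration
    of those episodes. The key "all" summarizes every episode.
    """
    weight_sum = defaultdict(float)
    duration_sum = defaultdict(float)
    for duration, exit_type, weight in episodes:
        for key in (exit_type, "all"):
            weight_sum[key] += weight
            duration_sum[key] += weight * duration
    total = weight_sum["all"]
    return {
        key: (RADIX * weight_sum[key] / total, duration_sum[key] / weight_sum[key])
        for key in weight_sum
    }


# Hypothetical episodes: (days from hospital entry to exit, exit type, survey weight).
sample = [
    (30, "readmission", 1.0),
    (90, "readmission", 1.0),
    (200, "end_of_study", 2.0),
    (20, "death", 1.0),
]
table = episode_life_table(sample)

# Consistency check on the 1982 "All Subpopulations" row of Figure 10:
# the all-cause e0 should equal the l0-weighted mean of the exit-specific e0 values.
figure10_1982 = {
    "readmission": (35_274, 78.86),
    "end_of_study": (53_784, 156.57),
    "death": (10_942, 34.32),
}
all_cause_e0 = sum(l0 * e0 for l0, e0 in figure10_1982.values()) / RADIX
print(round(all_cause_e0, 2))  # 115.78, the all-cause value printed in Figure 10
```

The final check uses only values printed in Figure 10. It suggests that the all-cause figure is a weighted average of the exit-specific durations, so a drop in the share of episodes ending in death can be read separately from a change in how long those episodes last.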

Finally,  another  way  to  examine  the  data  is  through  graphic 
representations  as  illustrated  in  Figures  11-13. 

Conclusion 

Global  measures  of  mortality  using  the  vital  statistics 
data  are  not  fully  satisfactory  because  they  do  not  relate 
outcome  to  service  use  and  shifting  service  patterns.  The 
alternative presented here, linking the NLTCS sample and
Medicare Part A data, gives a more complete picture,
controlling more rigorously for both patient
characteristics and service use changes that are driven by
reimbursement. We are then able to examine outcome
condition in a way that displays the dynamics of service
change.


*        *  * 

Kenneth  G.  Manton,  Ph.D.,  is  Research  Professor  of 
Demographic  Studies  and  Medical  Research  Professor, 
Department  of  Community  and  Family  Medicine,  Duke 
University  Center.  In  addition  to  his  university  post,  Dr. 
Manton  is  head  of  the  World  Health  Organization 
Collaborating  Center  for  Research  and  Training  in  the 
Methods  of  Assessing  Risk  and  Forecasting  Health  Status 
Trends.  He  also  serves  as  Associate  Editor  for  the  Journal 
of  the  American  Statistical  Association  and  Editorial 
Advisor for Demography.

FIGURE H

Monthly Vital Statistics Report

FIGURE 6


AGE SPECIFIC MORTALITY RATES OF WHITE MALE COHORTS

UNDERLYING CAUSE: SOLID TUMOURS

FIGURE 9


Medical Utilization 1982-1984 NLTCS

Episode types: (1) Medicare Hospital Episode; (2) Medicare SNF Episode; (3) Medicare HHA Episode

FIGURE 10


Weighted Empirical Life Table Values, l0 and e0, for Hospital-Nonhospital Episodes


                       All Causes       Hosp. Readmission         EOS               Death
                      l0       e0         l0       e0         l0       e0         l0       e0

All Subpopulations:
  1982             100,000   115.78     35,274    78.86     53,784   156.57     10,942    34.32
  1984             100,000   119.55     33,453    74.98     55,906   162.45     10,641    34.30

Community Disabled:
  1982             100,000   108.18     39,318    79.11     48,588   151.01     12,094    30.64
  1984             100,000   112.41     38,366    80.39     49,173   157.74     12,461    32.11

Community Disabled
Noncompleters:
  1982             100,000   100.26     37,123    77.40     52,105   133.94     10,772    16.18
  1984             100,000    84.22     38,753    44.28     42,262   145.20     18,986    29.98

Institutionalized:
  1982             100,000   109.32     31,766    83.93     45,170   159.65     23,064    45.72
  1984             100,000   102.34     32,547    76.39     42,397   157.49     25,056    42.75

Community
Nondisabled:
  1982             100,000   125.93     31,605    77.47     61,332   161.49      7,063    34.00
  1984             100,000   126.65     30,587    72.16     61,679   165.34      7,735    33.59


OUTCOME ASSESSMENT FOR THE PURPOSE OF DETERMINING
THE  EFFECTIVENESS  OF  HEALTH  CARE  INTERVENTIONS 


What  is  the  most  effective  health  care  intervention?  What 
to  buy? 

*  The  Medicare  End-Stage  Renal  Disease  (ESRD) 
program  makes  an  excellent  research  model 
because  the  population  is  clearly  defined, 
the  treatment  options  are  limited,  and  an 
important  component  of  the  total  cost  is 
known.

*  Through  the  ESRD  model,  we  can  measure 
outcomes  for  different  treatment  modalities, 
facilities,  quality  of  life,  and  cost. 

*  The  model  will  become  even  more  useful  with 
the  establishment  of  the  national  ESRD 
patient  registry,  which  gives  us  the  data 
base  for  developing  policies  and  plans  for 
the  future. 


By  Christopher  Blagg,  M.D. 

When  the  Medicare  End-Stage  Renal  Disease  (ESRD)  program 
was  instituted  in  1972,  it  was  talked  about  as  a  microcosm 
of  what  a  national  health  program  might  be.  I  think  it  is 
also  a  microcosm  of  what  can  be  done  in  relating  outcome  to 
cost  and  quality  of  care. 

Because  the  program  provides  coverage  for  almost  the  entire 
ESRD  population  in  the  United  States,  it  has  a  number  of 
advantages  as  a  model.  The  population  is  clearly  defined, 
the  treatment  options  are  limited  (either  kidney 
transplantation or some form of dialysis), and the
providers  are  limited  as  well.  In  addition,  the  Medicare 
portion  of  the  cost  is  known,  although  we  do  not  have 
comparable  information  for  other  costs  to  Medicaid,  private 
insurance  firms,  state  kidney  programs,  patients,  and  other 
resources. 

We are thus in a good position to assess outcomes in terms
of  survival,  quality  of  life,  and  cost.  In  addition,  the 
Health  Care  Financing  Administration  has  done  extensive 
studies  on  incidence  and  prevalence,  and  the  findings  raise 
some questions that would bear further investigation. For
example, Table 1 shows that males get transplanted more
often  than  females,  and  blacks  get  transplanted  less  often 
than  whites. 


Survival  Outcomes 

Our  major  interest,  however,  is  in  outcome,  and  patient 
survival  data  are  helpful  in  this  respect  because  obviously 
what  we  try  to  do  is  to  prevent  our  patients  from  dying. 
Survival  of  patients  depends  on  a  number  of  interrelated 
factors  such  as  age,  diagnosis,  complications  and  modality 
of  treatment.  Figure  1,  from  our  own  study  of  almost  1,000 
ESRD  patients,  shows  that  the  cause  of  kidney  failure  is  a 
significant  factor  in  survival.  When  one  compares  survival 
on  dialysis  with  that  following  cadaver  transplantation  and 
living-related  donor  transplantation,  at  first  glance  it 
would  seem  that  transplantation  is  clearly  superior,  but 
there are other factors to be considered (Figure 2).

Many  dialysis  patients  are  those  who  would  never  have  been 
considered  candidates  for  transplantation;  others  must 
spend  time  in  dialysis  while  they  wait  for  a  suitable 
donor,  and  this  time  should  be  credited  to  the  dialysis 
survival line (Figure 3). In addition, one must factor in
the  age  effect  (Figure  4)  and  the  effect  of  the  number  of 
associated  diseases  the  patient  has,  as  illustrated  in 
Figure 5. When the data are analyzed in this way, using the Cox
proportional hazards model to develop coefficients for the
different covariates that affect survival, and taking into
account the time dependency on dialysis, it becomes
apparent that there is no difference in the survival rates
for dialysis and cadaver transplantation (Table 2). A
living-related  donor  provides  the  best  match  and  therefore 
the  best  chance  of  survival.  Lacking  this  possibility,  it 
becomes  a  question  of  quality  of  life  in  deciding  whether 
to  be  transplanted  or  to  stay  on  dialysis. 
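
For readers unfamiliar with the method, the general form of a Cox proportional hazards model with a time-dependent treatment indicator is given below. This is the standard textbook form, offered as a sketch. The symbols are assumptions made for illustration and are not the notation of the study itself: x stands for fixed covariates such as age, number of associated diseases, and year of entry, and z(t) is 0 while the patient is on dialysis and 1 after transplantation.

```latex
h\bigl(t \mid x, z(t)\bigr) = h_0(t)\,\exp\!\bigl(\beta^{\top} x + \gamma\, z(t)\bigr),
\qquad
\mathrm{RR}_{\text{transplant vs. dialysis}} = e^{\gamma}
```

Because z(t) switches on only at the time of transplantation, the time a patient spends waiting on dialysis is credited to dialysis. A relative risk near 1, such as the adjusted value of 1.01 for cadaver transplantation in Table 2, corresponds to a coefficient near zero.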

It  is  also  possible  to  look  at  the  characteristics  of  the 
facilities  providing  care.  For  example,  Table  3  shows  that 
patients  in  free-standing  facilities  have  a  somewhat  lower 
risk  of  death  than  patients  who  are  dialyzed  in  hospital 
outpatient  dialysis  units.  This  same  study  also  compared 
facilities  that  do  and  do  not  re-use  dialyzers — an 
important  political  issue — and  found  better  survival  rates 
with  re-use. 

Figures  6  and  7  are  from  a  British  study  of  transplantation 
in  28  centers  which  showed  that  the  survival  rate  varied 
widely.  To  find  the  reasons,  the  researchers  analyzed  six 
of  the  centers  in  depth  and  found  that  the  variation 
depended on two main factors: the center's policy on blood
transfusion before transplantation, and the way in which
drugs were used to treat rejection, including whether patients
died from these medications.


Quality  of  Life  Outcomes 

Quality  of  life  is  obviously  an  important  factor,  because 
it  determines  to  some  degree  what  treatments  we  should 
encourage.  Table  7,  taken  from  a  paper  which  looks  at  the 
quality  of  life  in  dialysis  and  transplant  patients,  shows 
that  it  is  possible  to  adjust  for  case  mix  and  then  compare 
different  modalities  of  treatment.  Again,  transplantation 
emerges  as  superior,  although  there  are  differences  among 
dialysis  patients.  In  general,  patients  who  dialyze  at 
home  appear  to  enjoy  a  better  quality  of  life  than  those 
who  dialyze  in  a  facility. 


Cost  Outcomes 

Looking  at  the  cost  of  different  forms  of  treatment  can 
enable  us  to  make  projections.  For  example,  Krakauer  at 
HCFA  has  analyzed  the  costs  to  Medicare  of  different 
options  related  to  the  use  of  cyclosporine  after 
transplantation  and  the  effect  this  will  have  on  the 
program  in  the  future. 

The  same  approach  is  possible  in  looking  at  the  relative 
roles of dialysis and transplantation. Many more patients
have been transplanted in the last two or three years, and
as  a  result  there  will  be  many  fewer  dialysis  patients  in 
the  20-50  age  group.  That,  you  would  think,  would  have 
cost  benefits.  But  the  corollary  is  that  we  will  be 
treating  an  older  and  sicker  patient  population  by 
dialysis,  which  may  to  some  extent  counteract  these 
beneficial  effects. 

One  area  that  combines  both  quality  of  life  and  cost 
questions  concerns  erythropoietin,  a  hormone  produced  by 
the  kidney,  the  absence  of  which  causes  the  anemia  present 
in  almost  all  dialysis  patients.  This  hormone  has  now  been 
synthesized,  and  will  probably  be  generally  available 
within  the  next  few  years.  It  will  be  expensive,  but  it 
dramatically  improves  the  well-being  and  potential  for 
rehabilitation  among  dialysis  patients.  What  will  the  cost 
impact  be?  Who  will  pay  for  it?  How  will  it  affect 
rehabilitation?  These  are  all  issues  that  must  be 
considered.


Looking Ahead


The  United  States  has  tended  to  lag  behind  Europe,  Canada, 
Australia,  and  New  Zealand  in  terms  of  ESRD  data  systems. 
Presently  we  have  an  ESRD  Medical  Management  System  in  HCFA 
which  contains  much  useful  data,  although  outside 
researchers  have  found  it  very  difficult  to  gain  access. 
The  National  Institutes  of  Health  (NIH)  supports  a  registry 
of  patients  on  continuous  ambulatory  peritoneal  dialysis, 
and  a  transplant  registry  is  being  developed.  Now,  for  the 
first  time,  we  have  legislated  a  national  ESRD  patient 
registry,  to  be  run  jointly  by  NIH  and  HCFA,  which  we  hope 
will  provide  the  data  base  for  developing  policies  and 
plans  for  the  future,  as  well  as  for  research  purposes. 

With  the  national  registry,  we  will  be  able  to  look  more 
accurately  at  patients,  modalities  of  treatment, 
facilities,  and  other  variables — and  what  this  means  in 
terms  of  quality  of  care  is  that  we  can  begin  to  determine 
which  programs  are  less  than  good  and  identify  their 
problems.  We  will  be  able  to  conduct  highly  specific 
scientific  and  other  studies  with  this  patient  population, 
and  we  will  have  a  linkage  between  patient  data  and  cost 
data  that  should  make  it  possible  for  us  to  judge  whether 
the  reduction  in  reimbursement  of  the  past  few  years,  and 
the  changes  that  facilities  may  have  to  make  in  response, 
may  have  adverse  effects  on  patient  care. 

In  conclusion,  to  place  the  future  in  the  perspective  of 
the  past,  humanity  has  been  searching  for  accurate  outcome 
data  for  a  very  long  time.  I  refer  you  to  Psalm  39,  Verse 
4:  "Lord,  make  me  know  mine  end,  and  the  measure  of  my 
days,  what  it  is;  that  I  may  know  how  frail  I  am." 


*        *  * 

Dr.  Christopher  Blagg  is  a  physician  and  Executive 
Director  of  the  Northwest  Kidney  Center  in  Seattle,  and 
Professor  of  Medicine  at  the  University  of  Washington 
Medical School. He has worked actively for the
establishment of a national end-stage renal disease
registry in the United States and played a major role in
the recent enactment of federal legislation authorizing the
registry.

Table 1


Distribution of Medicare transplants by age, sex, race and
primary cause of renal failure: by donor, 1985.


                                      Donor Source
                              Cadaver         Live Related       Total      Percent
Group                     Number  Percent   Number  Percent     Number    Live Related

Age
  0 TO 14                    187     4%        168    10%          360        47%
  15 TO 24                   590    12%        327    20%          936        35%
  25 TO 34                 1,182    23%        537    32%        1,771        30%
  35 TO 44                 1,435    28%        348    21%        1,830        19%
  45 TO 54                 1,102    22%        208    13%        1,339        16%
  55 TO 64                   534    11%         63     4%          617        10%
  65 TO 74                    50     1%          3     0%           53         6%
  75 +                         4     0%          0     0%            4         0%
  Mean Age                  38.8              31.5                37.0        85%

Male                       3,245    64%        969    59%        4,321        22%
Female                     1,839    36%        685    41%        2,589        26%

Race
  Unknown                      6     0%          0     0%            7         0%
  White                    3,666    72%      1,387    84%        5,180        27%
  Black                    1,197    24%        206    12%        1,437        14%
  Other                      215     4%         61     4%          286        21%

Primary Disease
  Congenital Anomalies       523    10%        166    10%          704        24%
  Diabetes                   912    18%        320    19%        1,268        25%
  Glomerulonephritis       1,605    32%        572    35%        2,235        26%
  Hypertension               684    13%        125     8%          832        15%
  Other Urinary              207     4%         90     5%          302        30%
  (label illegible)        1,152    23%        381    23%        1,569        24%

Total                      5,084   100%      1,654   100%        6,910        24%

Source: ESRD-PMMIS, Bureau of Data Management and Strategy
Health Care Financing Administration

Table 2


Relative risk according to type of analysis. Reprinted with
permission from Vollmer, WM, Wahl, PW, Blagg, CR: Survival
with dialysis and transplantation in patients with end-stage
renal disease. N Engl J Med 308:1553-1558, 1983.


                                        Type of Analysis
Treatments Compared*    Time-Independent   Time-Dependent   Time-Dependent
                          (Unadjusted)      (Unadjusted)     (Adjusted)†

LRD with dialysis            0.23‡             0.29‡            0.55§
CAD with dialysis            0.53¶             0.58§            1.01
LRD with CAD                 0.43¶             0.50§            0.54§

*LRD denotes living related donor, and CAD cadaveric donor.
†Adjusted for age, number of associated diseases, and year of entry.
‡One-sided P value less than 0.001.
§One-sided P value between 0.01 and 0.05.
¶One-sided P value between 0.001 and 0.01.


Table 3

Relative risk of death for selected coordinating dialysis unit
(CDU) characteristics. Reprinted with permission from Held,
PJ, Pauly, MV, Diamond, L: Survival analysis of patients
undergoing dialysis. JAMA 257:645-650, 1987.


                                            Relative Risk
CDU Characteristics                           of Death*         P

Freestanding dialysis units vs
hospital centers
    For-profit                                  0.88          <.03
    Not-for-profit                              0.78          (illegible)

Large dialysis unit vs small
unit (75 vs 25 patients)                        0.89          <.00

Reuse of dialyzers vs never
reused
    Multiple uses started at
    or prior to 1980                            0.86          <.03
    Multiple uses started
    after 1980                                  1.01          <.88

Open staffing vs closed
staffing                                        1.20          <.00

>1 medical group practicing
in unit vs only 1 group                         0.69          (illegible)

No. of physicians on staff (6
vs 5)                                           1.02          <.00

Median family income (1979)
in county of unit ($22,000
vs $21,000)                                     1.01          <.34

Concentration of dialysis
market† (4 vs 6 dialysis
units)                                          1.03          <.07

*Other covariates held constant.
†A measure of competition.


OUTCOME ASSESSMENT FOR THE PURPOSE OF
EPIDEMIOLOGICAL  SURVEILLANCE 

How  can  one  use  population-based,  cross-sectional,  and 
longitudinal  analyses  of  disease  incidence,  mortality, 
morbidity,  and  disability  as  screens  for  peer  review? 

*  Differences  in  physician  practice  patterns 
are  an  important  source  of  variation  in 
outcomes.  Our  task  is  to  use  epidemiologic 
data  to  change  physicians'  perceptions. 

*  Physicians  have  no  epidemiologic 
information;  they  see  only  their  own 
practices.  Often,  providing  this 
information  on  an  ongoing  basis  is  enough  to 
effect  change. 

*  Epidemiologic  studies  also  serve  as  a 
screening  device,  enabling  physicians  to  ask 
the  right  questions  about  high-variation 
procedures.

*  To  be  effective,  study  results  must  be 
communicated  in  a  format  that  physicians 
will  understand  and  accept. 


By  Philip  Caper,  M.D. 

Using  epidemiologic  information  to  influence  what 
individual  physicians  do  is  not  an  easy  task.  I'd  like  to 
report  on  some  success  stories,  and  in  doing  so  I  will  also 
talk  about  the  use  of  claims  data  and  hospital  discharge 
data  to  monitor  outcomes  and  the  utilization  of  medical 
care. 

There  are  both  advantages  and  disadvantages  to  this 
approach.  The  advantages  are  low  cost,  the  relative  ease 
of  longitudinal  follow-up,  and  the  ability  to  study  both 
inpatient  and  outpatient  care  with  claims  data.  The  major 
disadvantage  is  a  lack  of  detailed  information  on  such 
important  elements  as  co-morbidity  and  functional  health 
status.  Therefore,  in  my  opinion,  this  approach  is  most 
useful  as  a  screening  device  which  points  the  way  to  areas 
for  further  investigation. 

In  studying  hospital  discharges,  we  must  first  consider  the 
sources  of  variations.  These  can  be  classified  into  three 
types:

*  Type I is associated with patient factors such as
illness  rates  and  access  to  care. 

*  Type  II  has  to  do  with  physician  uncertainty,  such  as 
differences  in  diagnostic  nomenclature  and  differing 
beliefs  in  the  effectiveness  of  treatment. 

*  Type  III  concerns  the  site  of  treatment,  in  or  out  of 
the  hospital,  which  is  also  a  physician  decision. 

Systematic  errors  of  coding  can  also  account  for 
differences  in  the  data.  Random  noise  based  on  sample  size 
and  small  number  problems  are,  we  believe,  not  important 
causes  of  variations  in  our  studies. 

To  focus  on  the  Type  II  variation  in  more  detail:  we 
believe  that  it  exists  no  matter  what  else  exists.  In  New 
England,  where  the  populations  are  relatively  homogeneous 
with  respect  to  illness  rates  and  socio-economic  status, 
our  studies  have  still  found  wide  variations  due  to 
differences  in  physician  practice  styles,  and  we  believe 
that  these  variations  also  exist  in  areas  where  there  are 
large  differences  in  illness  rates.  Thus  we  consider  Type 
II  as  virtually  a  constant,  and  we  are  concentrating  our 
efforts  on  it. 


Case  History:     Pediatric  Admissions 

Figure  1  is  taken  from  the  Maine  Medical  Assessment 
Project,  and  it  shows  hospitalization  rates  for  four 
pediatric  DRGs.  In  one  service  area,  the  rate  of 
hospitalization  was  more  than  double  the  state  average. 
When  the  pediatricians  at  the  local  hospital  were  given 
this  information,  the  rate  began  to  decline.  We  feel  that 
two  factors  were  involved. 

First,  the  pediatric  staff  was  told  about  the  high  rate. 
That is a most important point, because practicing
physicians  have  no  way  of  knowing  what  their  rates  are 
relative  to  their  colleagues.  They  do  not  think  in 
epidemiologic  terms — in  terms  of  a  denominator.  They  are 
focused  on  the  numerator — the  individual  patients  who  walk 
through  the  door.  When  systematic  information  about  the 
population  at  risk  is  fed  back  to  them,  they  are  surprised 
by  it. 

Second,  the  chief  of  service  at  the  hospital  began  to  post 
the  names  of  patients  being  admitted  and  the  admitting 
physician  in  the  doctors'  lounge.  This  form  of  monitoring, 
even though it was informal and internal, also contributed
to the decline in hospitalization: when the service
chief retired, the rates began to rise, which underscores the
need for an ongoing source of monitoring and feedback. The
notices in the lounge have since been reinstituted, and the
rates are dropping again. Variations in pediatric
admission rates are often examples of Type III
variation.


Case  History:     Hospital  Construction 

The  example  shown  in  Figure  2  concerns  a  hospital  in 
Vermont  which  knew,  because  of  a  population-based  data 
utilization  system  in  the  state,  that  they  had  very  high 
use  rates.  They  had  always  attributed  their  high  rates  to 
the  belief  that  they  served  an  older  population  than  other 
hospitals. 

The  hospital  submitted  a  certificate  of  need  proposal  to 
replace  their  existing  facility  at  the  same  bed  level.  The 
state  suggested  that  they  would  improve  their  chances  of 
approval  if  they  could  better  understand  their  utilization 
rates.  They  hired  us  to  do  a  very  detailed  utilization 
analysis  and  compare  it  with  age-specific  and  diagnosis- 
specific  utilization  rates  in  the  rest  of  the  state. 

We  learned  several  interesting  things.  First,  their  belief 
that  most  of  the  high  utilization  was  in  the  older  age 
ranges  was  not  borne  out;  a  great  deal  of  excess 
utilization  came  in  the  middle  age  ranges  and  was  by 
privately  insured  patients,  especially  in  certain  surgical 
case  mixes.  Second,  they  did  have  a  problem  with  the  very 
old  elderly,  in  that  it  was  difficult  to  find  long-term 
care  placements.  This  showed  up  in  an  extremely  high 
average  length  of  stay  in  that  age  group.  The  upshot  was 
that  the  medical  staff  and  administration  agreed  that  a 
smaller  inpatient  facility  would  be  appropriate  if  they 
could  acquire  some  outpatient  treatment  and  long-term  care 
capacity. 

The  point  is  that  people  will  respond  to  data  if  it  is 
credible,  if  it  is  presented  in  a  form  which  they  can 
understand,  and  if  it  does  not  too  sharply  dispute  their 
own  instincts  about  what  is  happening.  This  is  an  example 
of  the  use  of  epidemiologic  data  to  educate  and  to  change 
perceptions . 


Case  History:  Prostatectomy 

Prostatectomy  is  a  high-variation  cause  of  hospital 
admission, and we know that all prostatectomies are done in
the hospital. Thus a study of hospital discharge records
will yield complete data for the service area.
Prostatectomy rate variations, therefore, are usually
examples of Type II variations.

The  Maine  Medical  Assessment  Project's  urology  group 
decided  to  evaluate  the  outcomes  of  this  high-variation 
procedure,  through  a  study  that  encompassed  most  of  the 
urologists  in  the  state.  They  used  both  the  Medicare 
claims  data  base  and  a  survey  study  to  look  at  mortality, 
readmission,  re-operation,  and  functional  health  status. 
Figures  2-4  show  some  of  the  results. 

The mortality rates in Figure 3 show a death rate
approximately three times as high as the figures in the
urologic literature. The reason is that most of the deaths
occurred after hospital discharge, a period not covered in
most published studies. This shows the value of
longitudinal follow-up using claims data.

The  overall  mortality  rate  varied  three-fold  among 
hospitals,  with  small  hospitals  having  an  odds  ratio  of 
deaths  about  twice  as  high  within  three  months  of  surgery. 
The  data  do  not  tell  us  why.  They  do,  however,  enable  us 
to  ask  the  right  questions;  they  function  as  a  screen. 

Finally,  the  probability  of  re-operation  within  eight  years 
varies  with  the  type  of  surgery.  For  both  cancer  and  non- 
cancer  patients,  it  is  higher  with  the  trans-urethral 
approach than with the suprapubic approach. Almost 90% of
prostatectomies  today  are  trans-urethral,  even  though  there 
have  been  no  good  studies  comparing  the  two  methods.  This 
is  an  example  of  a  clinical  technology  which  has  been 
adopted  without  full  examination  of  the  alternatives. 

Figure  4  shows  what  happened  to  prostatectomy  rates  after 
publication  of  the  study  results.  They  began  to  decline. 
This  decline  had  nothing  to  do  with  cost  containment;  it 
was  purely  a  quality  issue.  The  questions  raised  by 
epidemiologic  research  led  to  a  change  in  physicians' 
perceptions  about  the  appropriateness  of  this  procedure. 


Conclusion 

Can  hospitalization  rates  be  reduced  safely?  Our  studies 
in  many  parts  of  the  country  show  that  the  areas  perceived 
to  have  high  quality  of  care — that  is,  areas  served 
primarily  by  university  or  teaching  hospitals — tend  to  have 
low  use  rates  compared  to  other  hospitals  in  their  own 
states.  This  is  borne  out  by  Figure  5,  which  shows 
admission rates in hospital-service areas served primarily
by teaching hospitals. It would seem that university and
teaching  institutions  generally  practice  a  conservative 
pattern  of  hospital  admissions  in  their  communities.  Such 
information  can  help  to  persuade  other  physicians  that  low 
hospitalization  rates  are  not  incompatible  with  high- 
quality  medical  practice. 

Can  epidemiologic  monitoring  be  done  routinely?  Yes. 
Figures  6  through  9  show  some  of  the  results  of  our  study 
using  the  SPARCS  data  base  in  New  York  State  to  analyze  all 
hospital  discharges  in  1985  and  1986  by  small  area.  The 
chart  gives  you  a  portion  of  the  western  upstate  region, 
showing the ratio of observed to expected admission rates
for each service area, where the expected rate is the state
average. The data can be broken out by case mix, and they
can show time trends.
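
As a rough illustration of the observed-over-expected comparison described above, the sketch below computes each service area's admissions relative to what the state average rate would predict. The area names, populations, and admission counts are invented for illustration and are not SPARCS data; a real analysis would presumably also adjust the expected figure for age, sex, and case mix.

```python
# Hypothetical service areas: (name, population, observed admissions).
# These numbers are made up for illustration; they are not SPARCS data.
areas = [
    ("Area A", 50_000, 7_500),
    ("Area B", 80_000, 8_000),
    ("Area C", 70_000, 4_500),
]

def observed_over_expected(areas):
    """Return {area: observed / expected}, where expected = state rate * population."""
    total_pop = sum(pop for _, pop, _ in areas)
    total_adm = sum(adm for _, _, adm in areas)
    state_rate = total_adm / total_pop  # admissions per person, all areas pooled
    return {name: adm / (state_rate * pop) for name, pop, adm in areas}

ratios = observed_over_expected(areas)
for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {ratio:.2f}")
```

A ratio above 1.00 marks an area admitting more than the state average would predict, and a ratio below 1.00 marks an area admitting less.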

An  important  point  about  this  study  is  its  simple  format. 
It  can  be  readily  grasped  by  physicians,  purchasers,  and 
policy-makers.  Epidemiologic  research  is  valueless  unless 
it  is  communicated  in  a  way  which  people  who  are  in  a 
position  to  make  decisions  can  understand.  It  must  be 
information  suited  to  action. 


*        *  * 

Dr.  Philip  Caper  is  a  Professor  of  Public  Policy  at 
Dartmouth  Medical  School  and  a  Lecturer  in  Social  Medicine 
and  Health  Policy  at  Harvard  University's  School  of 
Medicine.  He  is  also  President  of  the  Codman  Research 
Group,  Inc.  Dr.  Caper  has  served  on  a  number  of  national 
advisory  committees.  He  was  a  member  of  the  National 
Council  on  Health  Planning  and  Development  from  1977  to 
1984,  and  served  as  its  Chairman  from  1980  to  1984. 
Earlier  in  his  career  he  served  as  professional  staff 
member  of  the  Senate  Labor  and  Human  Resources  Subcommittee 
on Health.

[Figure: Central Maine Area, 1980-1985. Four Pediatric Medical DRGs; actual vs. expected (based on state average), by year, 1980 through 1985.]

Figure 2: Tonsillectomy Rate in Maine, 1969 - 1977

Figure 3


SUMMARY FINDINGS - Prostatectomy Studies

Death rates remain elevated following hospitalization

1. 60% of deaths within 3 months occurred after discharge

2. average TURP mortality at 90 days was 3.7%

Mortality rates vary among hospitals

1. hospitals with <150 beds have an odds-ratio of death
   1.8 times higher within 3 months of surgery

2. mortality rates among hospitals vary 3-fold

Probability of re-operation within 8 years varies by type
of surgery

              Cancer    Non-cancerous
TURP          33%       20.2%
Suprapubic    25.4%     10.1%
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Figure  4 
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Hospital  Market  Areas  Served  by 
University/Teaching Hospitals

Ratio to State Rate

                   Medical   Surgical
Iowa City, IA        .83       1.19
Morgantown, WV       .84        .78
Rochester, NY        .66        .83
Hanover, NH          .72        .86
Burlington, VT       .94        .96
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Figure  6 
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Figure  7 
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Figure  8 


Obs. / Exp. Admissions, 1985

Niagara East        1.57
Niagara Southw      1.39
Niagara Northw      1.21
Buffalo West        1.16
Buffalo East        1.13
Rochester SW         .66
Rochester North      .62
Rochester SE         .52
Rochester East       .52

Legend: O>E; NS Dif; O<E. Scale: .52, 1.00, 1.93.

Figure  9 
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MORTALITY  DATA 

What  are  the  strengths  and  limits  of  mortality  as  an 
outcome  measure  of  health  care  quality? 

*  Mortality  measures  can  be  misleading  because 
of  the  "signal  to  noise  problem,"  which 
produces  differences  in  results  that  are 
much  smaller  than  those  of  traditional 
health  services  research  methods. 

* One approach to solving this problem is to
study cases where mortality can be affected
by factors under hospital administrators'
control, to standardize as much as possible,
and to make extensive use of co-variates.

* Until the signal to noise problem can be
definitively resolved, it would be unwise to
rely solely on mortality measures of
outcome.


By  Gary  Gaumer,  Ph.D. 

The  limitations  of  mortality  data  drawn  from  Medicare  or 
other  insurance  systems  are  well  known.  Mortality  as  an 
indicator  of  quality  has  three  major  shortcomings.  First, 
and  probably  most  important,  mortality  is  a  fairly  blunt 
measure  of  quality.  Quality  can  deteriorate  significantly, 
particularly  for  some  types  of  cases,  before  survival  is 
threatened. Second, the available data do not permit
mortality measures to be standardized very well,
because clinical findings are absent from the stay
record. The third problem is that the data are widely
believed to be full of errors.

Still,  it  seems  that  three  factors  argue  in  favor  of  using 
these  data  to  evaluate  health  care  programs.  First,  death 
and  preventable  death  remain  an  unambiguously  important 
measure  of  outcome.  Second,  the  data  are  available  in  very 
large  samples,  on  all  U.S.  hospitals,  and  are  less 
expensive  to  use  for  both  monitoring  and  evaluation 
purposes  than  data  drawn  from  clinical  records.  And  third, 
the  statistical  advantages  of  large  samples  arguably  allow 
us  to  dominate  the  problems  of  small  effects,  blunt 
measures,  and  noisy  data  in  isolating  major  quality 
problems. 

The key to using the data is to solve what I will call the
"signal to noise problem." Adverse mortality effects are
likely to be much smaller than the effects we commonly look
at  with  traditional  health  services  research  methods,  and 
there  are  extreme  differences  in  hospital  mortality  rates 
across  patient  types  and  across  hospitals.  We  are  often 
looking  for  the  proverbial  needle  in  a  haystack. 
Statistically,  this  means  we  tend  to  reject  small  but  true 
effects — with  results  that  may  not  only  be  misleading,  but 
may  also  lead  policy-makers  into  dangerous  complacency.  It 
seems  to  me,  therefore,  that  researchers  would  be  wise  not 
to  rely  on  mortality  data  alone  as  an  indicator  of  quality. 


Filtering  out  the  Noise 

The tactics we have used to cope with the signal to noise
problem are:

*  Selection  of  sensitive  case  types. 

*  Indirect  standardization. 

*  Large  samples. 

*  Use  of  co-variates. 

*  Relaxed  statistical  criteria. 

The  basic  idea  is  to  try  to  design  around  the  section  of 
the  haystack  where  the  needle  is  most  likely  to  be  found, 
by  using  standardization  and  co-variates  to  reduce  the 
error  variation  and  to  better  detect  small  effects.  We 
have  also  tended  to  use  relaxed  statistical  criteria  so 
that,  when  we  do  find  the  needle,  we  do  not  reject  it  out 
of  hand.  This  means  that  our  studies  are  to  some  extent 
biased  in  favor  of  finding  adverse  effects. 
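
The trade-off behind large samples and relaxed statistical criteria can be sketched with a standard power calculation. The example below is only an illustration under stated assumptions: it uses a normal-approximation two-sided test for a difference between two death rates, and the rates, sample sizes, and significance levels are hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist

def power_two_rates(p_base, p_adverse, n_per_group, alpha):
    """Approximate power of a two-sided z-test comparing two death rates.

    Uses the normal approximation with an unpooled standard error.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt(p_base * (1 - p_base) / n_per_group
              + p_adverse * (1 - p_adverse) / n_per_group)
    shift = abs(p_adverse - p_base) / se
    # Chance that the test statistic lands beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical scenario: a 20% baseline death rate versus a 21% adverse rate.
strict = power_two_rates(0.20, 0.21, n_per_group=5_000, alpha=0.05)
relaxed = power_two_rates(0.20, 0.21, n_per_group=5_000, alpha=0.20)
large = power_two_rates(0.20, 0.21, n_per_group=50_000, alpha=0.05)

print(f"alpha=0.05, n=5,000:  power = {strict:.2f}")
print(f"alpha=0.20, n=5,000:  power = {relaxed:.2f}")
print(f"alpha=0.05, n=50,000: power = {large:.2f}")
```

Relaxing the criterion raises the chance of detecting a small true effect at the price of more false alarms, which is the bias toward finding adverse effects noted above; a much larger sample raises power without that price.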

To  illustrate,  I  have  chosen  one  of  three  large  data  sets 
that  we  constructed  to  look  at  mortality.  We  used  2,200 
hospitals,  10  years  of  data  ending  in  1983,  and  815,000 
cases  drawn  from  Medicare  records. 

We  think  that  a  significant  step  in  limiting  the  signal  to 
noise  problem  is  to  select  cases  to  study  where  mortality 
is  not  a  blunt  measure  of  outcome.  Specifically,  we  set 
out  to  find  a  set  of  cases  for  which  mortality  could 
reasonably  be  associated  with  aspects  of  structure  and 
process  that  were  under  administrative  control,  and  where 
life  could  be  threatened  by  rationing  or  cost  containment. 

Researchers  at  the  University  of  Michigan  took  the 
following  steps  to  select  these  cases,  using  panels  of 
clinicians and hospital administrators:

1. Determining the avenues of administrative response to
intervention  most  likely  to  affect  the  process  of 
care. 

2.  Determining  which  principal  diagnoses  and  procedures 
rely  most  on  aspects  of  the  care  process  involving 
significant  administrative  discretion. 

3. Ranking these diagnoses and procedures according to
the likelihood of serious adverse outcomes stemming
from deficiencies in aspects of care under
administrative control.

The  result  is  the  list  of  59  urgent  care  cases  shown  in 
Figure  1.  These  cases  represent  about  10%  of  Medicare 
admissions. 

The  next  problem  was  to  design  a  set  of  fatality  measures. 
We  decided  to  use  measures  that  were  based  around  the  point 
of  admission.  The  alternatives  were  to  use  mortality  rates 
keyed  to  the  status  at  discharge — not  a  good  choice  because 
of  inadequate  Medicare  data  and  the  measure's  sensitivity 
to  length  of  stay  policies — or  rates  keyed  to  the  onset  of 
illness.  Obviously,  the  latter  data  are  not  derivable  from 
Medicare  claims  information.  In  addition,  I  suspect  the 
measure  would  test  the  quality  of  the  medical  management 
process  rather  than  being  sensitive  to  the  factors  hospital 
administrators can control.

The  second  measurement  issue  is  the  choice  of  a  period  of 
follow-up  from  admission.  Our  view,  based  on  the  advice  of 
clinicians,  was  that  most  of  the  quality  or  outcome  threats 
should  appear  rather  quickly.  These  effects  would  then 
tend  to  persist,  but  the  short-period  measures  would  be 
better  than  the  long-period  measures  because  more 
confounding  influences  enter  the  picture  over  time.  We 
therefore  preferred  to  use  shorter-period  rates,  such  as 
30-day  fatality  rates. 

There  is  an  indication,  interestingly,  that  these  fatality 
rates  are  indeed  sensitive  to  the  level  of  spending  per 
case.  The  correlation  is  very  small,  and  there  is  a  lot  of 
variation  in  the  data.  But  there  is  a  suggestion,  in  the 
data  shown  in  Figure  2,  that  we  have  identified  cases  where 
mortality  is  sensitive  to  resource  use  in  hospitals. 

In  Figure  3,  the  left-hand  panel  is  a  distribution  of 
unadjusted  30-day  mortality  rates  for  these  cases  for  1974. 
The  right-hand  panel  shows  the  same  statistics  for  1983, 
and  the  column  indicates  categories  of  unadjusted  mortality 
rates. This chart shows three things. First, there is
extreme variation across hospitals in average mortality for
these cases. Second, the distribution change from 1974 to
1983 suggests a major downward trend in mortality.

The  most  important  thing  the  chart  says,  however,  is  that 
there  has  been  a  very  large  reduction  in  the  frequency  of 
high  mortality  hospitals  serving  the  Medicare  population. 
In  1974,  28%  of  the  Medicare  beneficiaries  with  these 
conditions  entered  hospitals  where  the  risks  of  dying 
within 30 days were greater than 1 in 4. By 1983,
approximately  5%  of  Medicare  beneficiaries  entered 
hospitals  where  the  risks  were  that  high.  There  would  seem 
to  have  been  some  remarkable  changes  in  mortality 
experience  in  the  Medicare  program. 


Other  Problems 

Standardization  is  another  factor  to  be  considered  in 
solving  the  signal  to  noise  problem.  We  have  attempted  to 
standardize  as  best  we  can;  Figure  4  illustrates  our 
approach  to  indirect  standardization,  which  is  admittedly 
crude.  We  could  not  include  co-morbidity  in  the  data  set 
because  those  data  were  inadequate  in  Medicare  files  for 
many  of  the  years  we  wanted  to  study. 
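
A minimal sketch of indirect standardization, assuming cells defined by diagnosis group, age group, and sex as in Figure 4, is given below. The national cell rates, hospital names, and case counts are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative and not part of the study.

```python
from collections import defaultdict

# Hypothetical national 30-day death rates by cell: (dx group, age group, sex).
NATIONAL_RATE = {
    ("AMI", "65-74", "M"): 0.20,
    ("AMI", "85+", "F"): 0.40,
}

def make_cases(hospital, cell, n_cases, n_deaths):
    """Build illustrative case records: (hospital, dx, age, sex, died)."""
    return [(hospital, *cell, i < n_deaths) for i in range(n_cases)]

def actual_to_expected(cases, national_rate):
    """Return {hospital: actual deaths / expected deaths}."""
    actual = defaultdict(int)
    expected = defaultdict(float)
    for hospital, dx, age, sex, died in cases:
        actual[hospital] += int(died)
        # Each case contributes its cell's national death rate.
        expected[hospital] += national_rate[(dx, age, sex)]
    return {h: actual[h] / expected[h] for h in expected}

cases = (
    make_cases("Hospital 1", ("AMI", "65-74", "M"), n_cases=10, n_deaths=2)
    + make_cases("Hospital 2", ("AMI", "85+", "F"), n_cases=10, n_deaths=5)
)
ratios = actual_to_expected(cases, NATIONAL_RATE)
for hospital, ratio in ratios.items():
    print(f"{hospital}: actual/expected = {ratio:.2f}")
```

In this made-up example the crude death rates are 20% and 50%, but once each hospital is compared with what the national rates predict for its own mix of cells, the gap between the two is far narrower.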

We  have  also  had  to  use  co-variates  in  an  attempt  to 
eliminate  as  much  variation  as  possible.  The  next  few 
charts  present  data  for  profit  and  non-profit  hospitals. 
First  we  will  look  at  relative  performance  results  using 
our  mortality  measures  alone,  and  then  we  will  see  the 
effect  of  using  co-variates. 

Figure  5  shows  unadjusted  30-day  mortality  rates  for  our  59 
case  types  in  proprietary  hospitals  and  non-profit 
hospitals.  There  is  scarcely  any  difference.  Adjusting 
and  standardizing  the  rates,  and  taking  the  ratio  of  actual 
to  expected  mortality,  does  not  improve  one's  insight  much. 
(See Figure 6.) It only shows that between 1978 and 1979
the diagnostic coding changed from ICDA-8 to ICD-9-CM,
which creates a spike in the data.

We  then  tried  regression  models  that  provided  for  a  number 
of  co-variates  having  to  do  with  the  hospital's  mission, 
the  characteristics  of  the  catchment  area,  regulatory 
programs  that  might  have  contributed  to  mortality  rates, 
and  influenza  rates  for  the  state.  We  also  included  a 
measure  of  base  period  standardized  mortality  for  each 
hospital.  The  results  in  Figure  7  show  that  proprietary 
hospitals do in fact have a higher mortality rate than
non-profits. There is also evidence that teaching hospitals do
better than non-teaching hospitals, and evidence that the
PSRO program and the certificate of need program both
improve survival.

Finally,  Figure  8  adds  the  hospital's  average  cost  per  case 
to  the  previous  model.  There  is  a  significant  relationship 
in  the  direction  one  would  expect,  suggesting  that  for 
every  $1,000  reduction  in  hospital-wide  expense  per  case, 
there  is  a  1.6%  increase  in  the  standardized  mortality  rate 
for  these  fragile  cases. 


Suggestions  for  Future  Research 

Our  work  does  suggest  that  by  focusing  on  fragile  cases, 
where  administrators  control  matters  that  relate  to 
survival,  the  signal  to  noise  problem  might  be  partially 
relieved.  But  the  work  we've  done  is  very  preliminary,  and 
the  methodology  has  never  been  validated.  I  would  like  to 
see  additional  investment  in  developing  a  set  of  case  types 
that  could  be  used  for  routine  monitoring  of  mortality  as 
it  is  threatened  by  resource  constraints  on  hospital 
managers.

In  addition,  studies  of  cost  and  mortality  tradeoffs  are 
surprisingly  rare,  yet  they  lie  at  the  heart  of  public 
policy.  The  tradeoff  they  imply  is  a  frightful  dilemma  for 
policy-makers  and  public  program  managers,  and  the  topic 
deserves  much  more  attention  than  it  gets.  There  is 
definitely  a  case  to  be  made  for  more  research. 

Finally,  there  are  serious  issues  to  be  resolved  in  making 
use  of  these  mortality  data.  They  have  to  do  with 
sharpening  our  focus  on  severity  of  prognosis.  In  the 
absence  of  a  good  standardization  technology,  and  in  view 
of  the  signal  to  noise  problem,  mortality  rates  must  remain 
indicative;  they  are  certainly  not  definitive  outcome 
measures.  Sole  reliance  on  these  measures  is  quite 
worrisome  because  of  the  possibility  that  one  will  be 
accepting  a  null  hypothesis  and  ignoring  true  effects. 

*        *  * 

Dr.  Gary  Gaumer  is  an  economist  and  Vice  President  of  ABT 
Associates,  Inc.,  of  Cambridge,  Massachusetts.  He  has  led 
extensive  research  for  the  federal  government  relating  to 
the  impact  of  state  and  federal  prospective  payment  systems 
on  hospitals,  and  in  particular,  the  impact  on  quality  of 
care. 
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Figure  1 

CARE SENSITIVE CONDITIONS

Group I: Urgent Care Diagnoses

No.  ICDA-8 Number     Diagnosis

Surgical Emergencies
 1.  423               cardiac tamponade
 2.  441               aortic aneurysm
 3.  560.2             volvulus
 4.  530.4, .9         rupture, perforation of esophagus
 5.  531.1             ulcer of stomach with perforation
 6.  532.1, .2         ulcer of duodenum with perforation
 7.  569.4, .9         perforation, ulceration of intestine
 8.  533.0             peptic ulcer with perforation
 9.  560.4             intestinal adhesions with obstruction
10.  567.9             peritonitis, unspecified
11.  540.9             appendicitis, acute

Potential Surgical Emergencies
12.  456.0             varicose veins of esophagus with hemorrhage
13.  531.0             ulcer of stomach with hemorrhage only
14.  532.0             ulcer of duodenum with hemorrhage only
15.  537.0             pyloric stenosis, acquired
16.  450               pulmonary embolism
17.  933, 934          foreign body in pharynx and larynx
18.  574.0             cholelithiasis with acute cholecystitis
19.  575               cholecystitis with cholangitis without
                       mention of calculus

Medical Emergencies
20.  780.0             coma
21.  573.9             hepatic coma
22.  347.9             Reye's syndrome
23.  580               acute renal failure (uremia)
24.  593.2             acute kidney (tubular) necrosis
25.  570               acute necrosis of liver
26.  577.0             acute pancreatitis
27.  410               acute myocardial infarction
28.  242               thyrotoxicosis (thyrotoxic storm)
29.  400               malignant hypertension
30.  482.3             staphylococcal pneumonia

Potential Medical Emergencies
31.  427.3             heart block (complete)
32.  427.4             paroxysmal atrial fibrillation
33.  427.5             paroxysmal atrial tachycardia
34.  345.1, .9, 780.2  grand mal seizure
35.  391               rheumatic fever with heart involvement
36.  427.1             left ventricular failure
37.  427.0             congestive heart failure
38.  582               uremia (chronic renal failure)

Intracranial Emergencies
39.  852               subarachnoid, subdural, and extradural
                       hemorrhage following injury
40.  851               cerebral laceration and contusion
41.  854               intracranial injury of other and
                       unspecified nature
42.  442               aneurysm of brain (with rupture)
43.  430               subarachnoid hemorrhage
44.  431               cerebral hemorrhage

Traumatic Injury to Vital Organs
45.  860-869.9         traumatic injury to vital organs

Serious Fractures
46.  806.0-.7          fracture of vertebral column with spinal
                       cord lesion
47.  804.0, .1         multiple fractures involving skull or
                       face with other bones
48.  809.0, .1         multiple and ill-defined fractures of
                       trunk

Potentially Serious Fractures
49.  800-801.1         fracture of skull
50.  808.0, .1         fracture of pelvis

Serious Burns
51.  942.948           burns of face, trunk, and limbs
52.  992.0-5           heat problems

Emergencies Due to Allergic Reactions
53.  999.4, 989.4      anaphylactic shock
54.  493               status asthmaticus (asthma)

Complex Diagnostic Entities
55.  191, 238.1        brain tumors
56.  322               brain abscess
57.  158.0             malignant neoplasm of retroperitoneal
                       tissue
58.  682.1             retroperitoneal abscess
59.  197, 230, 231     secondary and unspecified malignant
                       neoplasm of respiratory and digestive
                       systems
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Figure  3 
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Figure  4 

STANDARDIZATION 


354 CELLS USED FOR INDIRECT STANDARDIZATION
  59 dx groups
  3 age groups (65-74, 75-84, 85+)
  2 sex groups

Expected facility rate for each cell is the national
average rate for a 20% sample of cases for a 25%
SRS of U.S. hospitals

Actual and expected rates aggregated to hospital-year
level

Ratio of actual to expected mortality is used in
analysis


ACTUAL  MORTALITY  RATE 
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What  factors  need  to  be  considered  in  the  construction  of  a 
measure  of  severity? 

*  Discharge  abstract  data  bases  are  not  valid  data 
bases  for  evaluating  hospital  quality  of  care  or 
setting  public  standards  for  hospital  death 
rates. 

*  Quality  of  care  seems  likely  to  decline  in 
response  to  inaccurate  public  accountability 
information. 

*  Physiologic  measures  obtained  from  standard 
automated  blood  tests  could  be  used  to  develop  a 
more  reliable  measure  of  severity  of  illness  for 
all  medical  and  surgical  patients.  This  type  of 
measure  would  have  a  long-run  marginal  cost  of  1 
to  20  cents  per  patient. 

Variations  in  Outcome  and  Hospital  Response 

Douglas  P.  Wagner,  Ph.D. 


I  am  deeply  concerned  about  the  apparent  excess  death  rates 
observed  in  some  hospitals  and  welcome  a  new  age  of 
accountability  for  hospital  care.  Just  as  hospital  cost 
control  was  long  overdue,  so  too  is  accountability  for 
quality  of  care  and  outcome.  Extensive  research  on  volume 
and outcome by Luft et al., Flood et al., Riley & Lubitz,
Sloan et al., as well as clinical studies of CABG, the
national  Burn  Registry  studies,  and  our  own  studies  of 
intensive  care  units,  have  suggested  or  demonstrated 
important  quality  differences  across  hospitals.  (1-8) 

The  problem  with  studies  based  on  discharge  abstracts 
alone (1-4),  however,  is  that  one  cannot  determine  to  what 
extent  the  differences  in  outcome  are  caused  by  differences 
in  1) patient  selection  and  referral,  2) severity  of  acute 
illness,  3) coding,  4) variations  in  speed  of  discharge,  or 
5) quality of care. The advantage of the clinical studies
(Burn, CASS, APACHE) is that they have had careful
prospective measurement of severity of illness, diagnoses,
and therapeutic approaches. Thus they can more strongly
rule out severity, patient selection, and coding
differences, and two of the clinical studies had direct
evidence of bad therapy besides the excess death rates. As
a result, clinical process has changed.


The measurement reliability necessary for any case mix
system depends upon the use and the potential behavioral
response  to  that  use.  Diagnostic  coding  is  sufficiently 
accurate  to  use  as  a  screening  device  to  select  hospitals 
for  further  in-depth  study  and  review.  The  in-depth  review 
would  have  to  look  for  evidence  of  bad  process  as  well  as 
for  evidence  of  adverse  case  selection. 

The  publication  of  "consumer  reports"  listing  "expected" 
and  observed  hospital  death  rates,  however,  seems  likely  to 
be  harmful  to  the  quality  of  United  States  hospital  care 
because  of  the  widespread  publicity  those  numbers  will 
receive,  and  the  effort  hospitals  will  be  forced  to  expend 
to  avoid  the  adverse  publicity. 

There  are  at  least  seven  potential  responses  by  hospitals 
to  the  publication  of  their  "expected"  and  observed  death 
rate  on  page  one  of  national  and  local  newspapers: 

1.  Improve  quality  of  care  by  either: 

a.   identifying  and  expelling  suspect  physicians 

or 

b.   improving  the  quality  of  nursing  and  other 
patient  care  services. 

2.  Inhibit  the  development  and  diffusion  of  new 
surgical  procedures. 

3. Upcode.

4.  Limit  the  admission  of  the  severely  ill. 

5.  Increase  the  admission  of  low  severity  patients. 

6.  Discharge  patients  to  die  elsewhere  (VA,  other 
hospitals,  home,  nursing  homes,  hospice). 

7.  Spend  substantial  resources  to  "prove"  a  hospital's 
case  mix  is  more  severely  ill  than  the  average. 

Unfortunately the last six items on this list are all far
easier to do, technically and bureaucratically, than
improving the quality of care.

One  fundamental  problem  in  evaluating  hospital  death  rates 
is  that  in  most  diagnostic  categories  the  vast  majority  of 
patients  are  at  a  very  low  risk  of  death,  particularly  in 
surgical  DRGs,  with  a  small  percentage  of  the  population 
who  are  severely  ill  and  at  a  much  higher  risk  of  death. 
If the small group of severely ill patients is unequally
distributed, and not adequately measured in the setting of
standards, the conclusions regarding quality of care can be
grossly wrong. This was most recently illustrated in a
brief analysis of 1500 consecutive abdominal surgery
patients. (9)
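
The arithmetic behind this point can be sketched as follows. The death rates and shares of severely ill patients are hypothetical; the two hospitals are assumed to have identical death rates within each risk group and to differ only in how many severely ill patients they admit.

```python
def crude_death_rate(share_severe, rate_low=0.01, rate_severe=0.40):
    """Overall death rate for a mix of low-risk and severely ill patients."""
    return (1 - share_severe) * rate_low + share_severe * rate_severe

# Same risk-specific death rates, different share of severely ill patients.
hospital_a = crude_death_rate(share_severe=0.08)
hospital_b = crude_death_rate(share_severe=0.13)

print(f"Hospital A crude death rate: {hospital_a:.1%}")
print(f"Hospital B crude death rate: {hospital_b:.1%}")
```

Under these assumptions a difference of five percentage points in the share of severely ill patients moves the crude death rate from roughly 4% to roughly 6% with no difference in quality of care, so a standard that fails to measure that share would misjudge the second hospital.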

Another  serious  public  policy  problem  with  the  publication 
of  "adjusted"  death  rates  by  hospital  is  that  most  of  the 
deaths  are  among  non-operative  patients,  for  whom  the 
adequacy  and  accuracy  of  the  coding  is  most  suspect,  while 
all but one of the inter-hospital variation studies have
focused only on post-operative patients. The non-operative
patients are the same group for whom the DRG system
explains almost none of the inter-patient variation in
charges.

The  uncertainty  about  the  causes  of  the  differences  in 
outcome  in  the  discharge  studies  is  heightened  by  the 
important  and  widely  replicated  results  on  small  area 
variations reported by Wennberg et al. and Roos
et al. (10, 11) The high frequency hospitals in the surgery
volume  studies  might  merely  be  the  hospitals  with  the  most 
aggressive  surgeons  operating  on  many  healthy  patients. 

Virtually  every  hospital  and  physician  believes  it  provides 
"good"  quality  care,  and  will  reject  feedback  to  the 
contrary unless it is documented with incontrovertible
evidence. Every hospital and physician knows that mistakes
are occasionally made, but there is little knowledge of the
frequency and impact of the mistakes, partly because the
severe financial penalties from malpractice suits
discourage documentation of mistakes. I believe that the
cause of a difference between a 4% and a 6% death rate
cannot be proved by discharge abstracts alone.

Therefore,  discharge  abstract  data  bases  are  not  valid  data 
bases  for  evaluating  hospital  quality  of  care  or  setting 
public  standards  for  hospital  death  rates.  They  are  not 
reliable enough, as indicated by the recent substantial
increase (20 to 40%) in the coding of secondary diagnoses
among large Medicare data bases. More importantly,
discharge diagnoses do not contain sufficient information
about a patient's acute risk of death.

Within  virtually  any  disease  there  are  many  patients  who 
are  not  hospitalized,  others  who  receive  brief  and 
uneventful  courses  of  hospital  care,  and  others  who  are 
desperately ill. The human body is extraordinarily complex
and  medical  knowledge  is  extraordinarily  detailed.  Medical 
care  is  the  art  of  making  difficult  prospective  decisions 
under  risk  and  uncertainty,  while  carefully  minimizing  the 
risk  of  avoidable  death  or  loss  of  function.  There  is 
uncertainty  about  what  is  wrong  with  the  patient  and  the 
diagnosis is the best guess. There is risk that even
well-done treatment will cause unintended side effects, as well
as the risk that a therapy might cause patient harm because
of poor execution.

Validity  and  Reliability 

The  validity  of  any  case  mix  system  depends  on  measurement 
reliability  and  predictive  accuracy.  These  are  the 
measurement  reliability  issues  we  thought  about  when  we 
designed  the  APACHE  system  9  years  ago: 

1. What is the inter-judge reliability across data
collectors?

2.  How  big  is  the  impact  of  a  single  data  error  in 
predicting  death  rates? 

3.  How  sensitive  are  the  variables  to  variations  in 
physician  practice  style  or  aggressiveness?  Would 
a  hospital  increase  its  "predicted"  death  rate 
more  than  its  observed  death  rate  by  using 
ventilators  or  cardiac  catheters  more 
aggressively?  How  will  varying  financial 
incentives  affect  the  use  of  these  technologies 
and  vice  versa? 

4. How robust would the measures be to incentives to
creep?  Published  hospital  death  rates  will  provide 
far  stronger  incentives  to  creep  than  the  limited 
financial  incentives  under  the  DRG  system. 

5.  What  is  the  bad  outcome  rate  relative  to 
measurement  error  and  selection  bias?  If  an  event 
is  rare,  measurement  reliability  and  precision 
must  be  extraordinarily  accurate. 
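
One way to see why rare events demand very accurate measurement (item 5) is to ask what share of flagged bad outcomes are real when the measurement has some error. The sketch below is an illustration only: it treats measurement error as a simple false-positive and false-negative rate, and the event rates and accuracy figures are hypothetical.

```python
def share_of_flags_that_are_real(event_rate, sensitivity, specificity):
    """Fraction of flagged cases that are true events (positive predictive value)."""
    true_flags = event_rate * sensitivity
    false_flags = (1 - event_rate) * (1 - specificity)
    return true_flags / (true_flags + false_flags)

# Same measurement accuracy, applied to a rare and a common bad outcome.
rare = share_of_flags_that_are_real(0.02, sensitivity=0.95, specificity=0.95)
common = share_of_flags_that_are_real(0.30, sensitivity=0.95, specificity=0.95)

print(f"Rare outcome (2%):    {rare:.0%} of flags are real")
print(f"Common outcome (30%): {common:.0%} of flags are real")
```

With the same instrument, most flags for the rare outcome are measurement error while most flags for the common outcome are real, which is why the tolerable error shrinks as the event becomes rarer.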

Formal  scientific  studies  of  instrument  reliability  rarely 
do  more  than  address  the  first  of  these  measurement 
problems,  while  valid  public  policy  use  of  measurement 
systems  requires  a  careful  analysis  of  all  5  potential 
problems  with  measurement  reliability. 

Acute  Physiology  and  Chronic  Health  Evaluation (APACHE  II) 

To  prevent  measurement  bias  in  APACHE,  particularly  due  to 
variations  in  physician  aggressiveness,  we  developed  a 
conceptual  model  of  the  determinants  of  hospital  outcome 
(Figure 1). We used the model to divide the realm of
possible measurements into those we wanted and those we did
not want to use. (12) Specifically, Figure 1 shows patient
factors  before  treatment,  and  the  items  listed  under  that 
category  are  those  we  wanted  to  measure  as  well  as 
possible. In addition to admitting disease process, we
wanted to measure the acute severity of illness with highly
reproducible  measures.  So  we  used  chemical  blood  tests  and 
vital  signs,  factors  that  have  a  precision  of  measurement 
approaching  the  precision  one  finds  in  the  physical  and 
biologic  sciences.  Partly  because  of  measurement  precision 
and  partly  because  of  the  predictive  power,  the  physiologic 
measures  used  in  APACHE  II  are  the  centrally  important  part 
of  our  analyses. 


FIGURE  1 

DETERMINANTS  OF  OUTCOME  FROM  ACUTE  ILLNESS 


o  PATIENT FACTORS PRIOR TO TREATMENT

   1.  PRIMARY ADMITTING DISEASE

   2.  ACUTE SEVERITY OF ILLNESS

   3.  PHYSIOLOGIC RESERVE

o  POST TREATMENT INFORMATION

   4.  TREATMENTS USED

   5.  TIMING

   6.  PROCESS

   7.  RESULTS OF DIAGNOSTIC TESTS

   8.  RESPONSE TO THERAPY


We also knew that something else besides acute severity of
illness mattered. Older patients with advanced chronic
conditions have less vital capacity to withstand and
survive the acute insults of disease than patients who are
younger, or patients who do not have severe chronic
diseases. We have quantified a patient's physiologic
reserve with measures of chronologic age and the presence
or absence of severe, end stage chronic disease. We would
like to move to a more precise measure of biologic age as a
measure of physiologic reserve, and are in the process of
exploring  better  measurements  in  this  area.  But  finding 
the  best  measurement  of  physiologic  reserve  will  take  5  to 
10  years  or  more.  In  the  meantime,  APACHE  II  does  include 
some  important  aspects  of  physiologic  reserve,  and  they 
reflect  the  basic  clinical  perception  that  an  acute  illness 
can  have  a  different  outcome  depending  on  the  patient's 
status  before  its  onset. 

The bottom half of Figure 1 lists the factors which must be
excluded from a severity of illness measurement system.
Since the primary purpose of the severity measurement is to
evaluate the appropriateness of therapy and clinical
judgement,  the  measurement  itself  must  be  independent  of 
those  therapies  and  judgements.  Explanatory  power  is 
spuriously  high  when  the  same  variable  appears  on  both 
sides  of  a  regression  equation. 

We  are  now  working  with  APACHE  II,  which  we  designed  for 
use  with  intensive  care  patients.  In  this  system  we  assign 
points  for  12  physiologic  variables,  which  we  add  up  into 
an acute physiology score (APS). We also assign points for
age  and  chronic  health  status,  and  the  three  elements 
combined  make  up  an  APACHE  II  score.  The  complete  details 
on  the  measurement  of  APACHE  II  have  been  widely  published. 
(7,8,13)  Eighty  percent  of  the  points  are  allocated  to  the 
APS.  I  want  to  stress  that  the  weights  given  to  the  age 
and  chronic  health  elements  are  not  appropriate  for  the 
whole  hospital  case  mix;  they  are  specifically  weighted  for 
the  intensive  care  unit  population.  The  weights  within  the 
APS  component  of  APACHE  II  probably  are  appropriate  for  all 
hospitalized  patients,  because  of  a  long  history  of 
research  in  the  biochemistry  and  pathophysiology  of  human 
disease.  There  is  no  comparable  history  of  hard  science 
research  underlying  the  chronic  health  measurements. 

The method of weighting the 12 physiologic variables in the
APS is illustrated in Figure 2, which shows the weights for
one of the variables, serum sodium. The principle is that
if the physiologic measurement is within the normal range,
the variable receives zero weight; as the measure deviates
from the normal range in either direction, the weight
increases.  The  decisions  about  where  to  draw  the  lines  and 
how  much  weight  to  assign  to  different  levels  of 
derangement  in  the  physiologic  parameters  are  based  on  the 
clinical  judgement  of  a  multi-disciplinary  group  of 
experienced  ICU  physicians. 
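The weighting principle can be written as a simple step function. The sketch below is illustrative: the function name is hypothetical, and the sodium cut points are an assumption, taken from the published APACHE II scoring table as best recalled here rather than from this talk, so they should be checked against the cited publications before any use.

```python
def sodium_points(serum_sodium_mmol_per_l: float) -> int:
    """APS weight for serum sodium.

    Cut points are assumed (recalled from the published APACHE II table);
    they are not stated in this talk.
    """
    na = serum_sodium_mmol_per_l
    if na >= 180:
        return 4
    if na >= 160:
        return 3
    if na >= 155:
        return 2
    if na >= 150:
        return 1
    if na >= 130:
        return 0  # normal range receives zero weight
    if na >= 120:
        return 2
    if na >= 111:
        return 3
    return 4


for value in (105, 125, 140, 152, 185):
    print(value, sodium_points(value))
```

Under these assumed cut points the weight is zero inside the normal range and rises in steps on both sides, with the first step below the range larger than the first step above it.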

It  is  important  to  note  that  the  increases  in  weight  in 
Figure  2  are  non-linear  and  non-symmetric  around  the  normal 
range.  The  fact  that  the  slope  is  somewhat  lower  for  high 
sodium  than  for  low  sodium  simply  reflects  biochemistry. 
It  is  very  difficult  and  expensive  to  use  multivariate 
statistical  techniques  to  try  to  empirically  define  the 
optimum  functional  forms  and  weights  out  of  an  existing 
data  base.  This  is  because  of  the  dichotomous  dependent 
variable  with  80%  survival  rates  and  the  wide  range  of 
potential  interactions  and  non-linear  relationships  among 
physiologic  and  diagnostic  variables.  You  cannot  use 
statistical  search  techniques  unless  you  have  a  very,  very 
large  sample,  with  extreme  attention  to  accuracy  of  data 
and  to  influential  outliers.  It  is,  therefore,  much  more 
efficient  to  start  with  reasonably  agreed  upon  clinical 
judgements, systematically transformed into weights.

Physiologic Severity, Diagnosis,
and Risk of Hospital Death

Turning  to  data,  Figures  3,  4,  and  5  illustrate  that 
disease  alone  is  not  enough  to  accurately  stratify  patients 
by  risk  of  death. 

Data  published  in  the  Archives  of  Internal  Medicine  (14) 
show  six  separate  diseases  with  six  separate  bar  charts. 
Within  each  of  these  diseases  acute  physiologic 
derangement,  as  measured  across  the  horizontal  axis  by  the 
APS  within  24  hours  of  ICU  admission,  is  very  strongly 
correlated  with  risk  of  observed  hospital  death  measured  on 
the  vertical  axis.  If  diagnosis  alone  were  a  sufficient 
predictor  of  hospital  death  rates,  then  each  of  these  six 
figures  would  have  a  horizontal  line  describing  the 
relationship  between  physiologic  derangement  and  hospital 
death rate. The same non-horizontal relationship for 17 of
18 specific diseases is demonstrated in the article in the
Archives of Internal Medicine (14). Unpublished analyses
have replicated this to a total of 35 of 36 disease
categories.

Moving  to  aggregate  data  across  disease  categories,  Figure 
3  illustrates  two  points. (15)  Patients  are  sorted  into 
increasing  APACHE  II  scores  (3  point  ranges)  on  the 
horizontal  axis  and  death  rates  are  measured  on  the 
vertical axis. The first of the 2 bars within each APACHE II
range  is  a  measure  of  the  observed  death  rate,  which  varies 
from  1.4%  for  patients  with  zero  to  two  APACHE  II  points, 
up  to  94%  for  patients  with  42  or  more  APACHE  II  points. 
Note  that,  despite  being  selected  for  admission  to 
intensive  care  units,  the  absence  of  substantial 
physiologic  derangement,  as  reflected  in  an  APACHE  II  score 
less  than  3,  implied  a  very  low  risk  of  dying  in  the 
hospital  for  these  patients.  Therefore,  a  patient  is  very 
unlikely  to  die  in  the  hospital  unless  he  has  some 
physiologic  derangement. 

The  second  set  of  bars  in  Figure  3  represents  a  set  of 
predicted  death  rates  based  primarily  on  diagnosis,  though 
age  and  chronic  health  measures  were  also  used  in  the 
predictions.  The  predictions  were  based  on  a  multiple 
logistic  regression  analysis  which  used  all  predictor 
variables  except  for  the  components  of  the  APS  of  APACHE 
II. It is interesting to note that, sorted by APACHE II
ranges, the predicted death rate starts at 11% and climbs
only to 40%. The predictions using disease alone, without
the APACHE acute physiologic measures, are strongly
compressed toward the mean death rate of the sample, and
they are very inaccurate for patients at either end of the
risk spectrum. Thus a hospital which overused its
intensive care units by admitting many low severity
patients would have predicted death rates substantially
higher than observed, if the predictions were based only on
diagnosis and age. At the opposite end of the spectrum, a
conservative hospital would be punished for restricting
ICU admission to only those at higher
risk.
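A small worked example shows the size of the distortion. It uses the rates quoted above for the lowest and highest severity ranges (1.4% observed against 11% predicted from diagnosis, and 94% against 40%). The patient counts and the function name are hypothetical, and the sketch assumes each hospital's patients die at exactly the observed rate for their severity range, that is, that care is of average quality.

```python
from typing import Tuple


def observed_vs_expected(
    n_patients: int, observed_rate: float, predicted_rate: float
) -> Tuple[float, float, float]:
    """Return (observed deaths, expected deaths, observed/expected ratio)."""
    observed = n_patients * observed_rate
    expected = n_patients * predicted_rate
    return observed, expected, observed / expected


# Hospital admitting only the lowest-severity patients (hypothetical 200 cases).
low = observed_vs_expected(200, observed_rate=0.014, predicted_rate=0.11)

# Hospital admitting only the highest-severity patients (hypothetical 50 cases).
high = observed_vs_expected(50, observed_rate=0.94, predicted_rate=0.40)

for label, (obs, exp, ratio) in (("low severity", low), ("high severity", high)):
    print(f"{label}: {obs:.1f} observed vs. {exp:.1f} expected (ratio {ratio:.2f})")
```

With identical care, the first hospital appears to have far fewer deaths than expected and the second far more, purely because the diagnosis-only prediction ignores physiologic severity.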

Figure  4  illustrates  what  happens  when  the  predictor 
variables  include  the  APS  of  APACHE  II  as  well  as  the 
diagnoses,  age,  and  chronic  health  measures. (15)  Note  that 
the  predicted  and  observed  death  rates  are  now  very  closely 
matched  throughout  the  range  of  severity  of  illness,  from 
the  1.4%  risk  of  death  at  the  bottom  up  to  the  94%  risk  at 
the  top. 

The  close  agreement  between  predicted  and  observed  death 
rates  with  APACHE  II  has  been  replicated  in  randomized 
split  halves  forecasts  as  well  as  forecasts  to  patient 
samples  from  other  countries.  Figure  5  illustrates  the 
accuracy  of  using  the  same  forecasting  equation  used  in 
Figure  4  but  applied  to  1005  consecutive  ICU  admissions  to 
3  of  the  5  tertiary  care  ICUs  in  New  Zealand. (16)  Despite 
a  very  different  patient  population  in  terms  of  age  and 
disease  distribution,  the  APACHE  II  forecasts  stratified 
the  New  Zealand  patients  very  well. 

My  concern  about  hospitals  denying  access  to  hospital  care 
for  severely  ill  patients  because  of  high  perceived  risk  of 
death  would  be  baseless  if  clinicians  could  not  tell  which 
patients  were  at  high  risk.  Physicians  or  nurses  can  do  a 
reasonably  accurate  job,  however,  of  stratifying  patients 
into ascending risk groups. Some recent results, published
in abstract form only by a group of ICU physicians at Wayne
State University in Detroit (17), show that on the first
day  in  the  ICU  the  physicians  (or  the  nurses)  could  do 
quite  well  at  predicting  the  risk  of  subsequent  death  among 
this  sample  of  325  ICU  patients. 

Implications  for  Quality  of  Care  Assessment. 

I  believe  that  a  global  measure  of  severity  of  acute 
illness  for  all  medical  and  surgical  patients,  similar  to 
the Acute Physiology Score of APACHE II, could be developed.
The  severity  score  could  be  measured  at  hospital  admission 
as  a  measure  of  incoming  severity  of  illness,  and  the  same 
scale  could  be  measured  at  hospital  discharge  as  a  more 
sensitive  measure  of  outcome.  It  might  or  might  not  be  as 
useful  as  existing  chronic  health  measures,  but  it  would  be 
much  more  reliably  measured  and  cheaper  to  measure  by 
orders of magnitude.

How? Automated computerized venous blood tests (SMA-6,
SMA-12, and CBC) are highly accurate measures of a large
portion  of  physiologic  derangement.  Most  patients  already 
have  these  blood  tests  done  repeatedly,  and  the  marginal 
cost  of  these  tests  is  in  the  vicinity  of  25  cents  per 
test.  At  many  institutions  the  results  are  captured  in 
electronic  form  in  clinical  laboratory  minicomputers  which 
could  easily  be  linked  to  hospital  discharge  abstract 
information. Vital sign information is also important to
clinicians, and may be important for these purposes, but
it would require expensive original data collection, both
in  the  research  mode  and  later  in  the  final  application 
mode.  The  human  intervention  required  to  obtain  and  record 
the  measures  also  implies  a  larger  potential  for 
measurement  bias.  The  best  physiologic  measure  of  severity 
of  illness,  developed  for  the  entire  hospital  case  mix  of 
medical  and  surgical  patients,  might  use  fewer  or  more 
physiologic  parameters  than  are  in  APACHE  II. 

Finally,  research  to  develop  a  better  measure  of  hospital 
severity  of  illness  needs  to  go  through  specific  stages  in 
developing  the  information.  The  first  step  is  to  determine 
basic  definitions,  and  do  initial  testing  under  controlled 
conditions  within  defined  diseases.  It  is  important  that 
at  this  stage  of  the  research  the  definitions  not  be 
contaminated  or  influenced  by  variations  in  quality  of 
care.  Therefore,  given  the  low  bad  outcome  rates,  this 
stage  cannot  be  based  on  a  shotgun  data  collection  process 
across  a  large  set  of  hospitals.  If  it  is,  the  resulting 
measures  will  degrade  substantially  when  applied  to 
independent  data. 

After  the  first  stage  is  completed  it  is  then  time  to  move 
to  research  on  larger  multi-hospital  data  bases  where  more 
precise  estimates  of  inter-hospital  variation  in  quality  of 
care  may  be  obtained.  In  this  second  stage  direct  measures 
of  the  process  of  care  are  necessary,  so  that  the  cause  of 
variations  between  expected  and  observed  outcome  may  be 
identified.  Finally  after  more  extensive  multi-hospital 
testing  it  would  then  be  appropriate  to  move  into  regional 
or national policy applications.


FIGURE 2


FIGURE 3


FIGURE 4

Severity-Predicted And Observed Death Rate*


FIGURE 5

APACHE II AND HOSPITAL DEATH RATES


MORTALITY DATA

What  is  the  utility  of  risk-adjusted  mortality  as  a 
monitoring  tool? 

*  A  RAMO  (Risk-Adjusted  Monitoring  of 
Outcomes)  system  for  mortality  has  the 
advantage  of  using  an  important  measure  that 
is  readily  available  from  existing  data. 
Like  all  outcome  measurement  systems,  it  has 
the  additional  advantage  of  being  able  to 
identify  superior  as  well  as  inferior 
performance. 

*  The  preliminary  results  of  a  Maryland  study 
of  in-hospital  mortality  for  surgical  cases 
seem  to  show  that  a  RAMO  system  can 
accurately  identify  logical  differences  in 
the  risk-adjusted  outcomes  of  these  cases. 


By  Mark  S.  Blumberg,  M.D. 

The  work  I  will  describe  today  is  rather  small-scale.  I 
set  out  to  do  something  that  wouldn't  take  15  years  and 
wouldn't  need  a  revolutionary  new  data  system.  Instead,  I 
wanted  something  that  would  be  workable  and  useful  within 
the  many  limits  of  available  data. 

I  chose  to  study  mortality  as  an  outcome  of  surgery,  but  I 
am  equally  interested  in  other  outcome  measures  such  as 
nosocomial  infections  and  other  therapeutic  misadventures. 
It  happens  that  mortality  is  reasonably  well  documented  in 
hospital  records.  It  is  also  still  important.  While  I 
chose  to  study  in-hospital  mortality,  I  would  have 
preferred  using  all  mortality  within  30  days  of  surgery, 
but  it  was  not  available  from  the  discharge  abstract. 

Figure  1  gives  an  analytic  schema  for  the  study  of 
outcomes.  You  will  note  that  items  1  through  5  are  linked 
by  double  arrows,  because  this  is  an  iterative  process. 
The  choice  of  an  outcome  measure  automatically  limits  you 
in  terms  of  available  data  and  the  universe  of  cases  you 
want  to  study.  You  may  need  to  go  back  and  modify  the 
measure  or  the  set  of  cases  to  which  it  is  applied  in  order 
to  arrive  at  a  meaningful  system.  There  is  no  optimum 
answer;  it  is  the  art  of  the  possible. 

In Figure 2, I have tried to set down some specifications
for deciding how feasible it is to analyze a given outcome,
and in Figure 3 I have listed some of the possible uses of
what  I  call  a  RAMO  (Risk-Adjusted  Monitoring  of  Outcomes) 
system.  The  charts  are  self-explanatory,  but  I  would  like 
to  point  out  that  in  contrast  to  process  measurement 
systems,  where  you  can  only  identify  failures,  an  outcome 
study  gives  you  the  ability  to  identify  institutions  and 
other  providers  who  do  much  better  than  expected  as  well  as 
those  who  do  much  worse. 

I will move now to an example of a RAMO system, a recently
completed study of Maryland hospitals done under contract
to  the  state's  Health  Services  Cost  Review  Commission. 
Because  the  findings  are  still  under  review,  the  hospitals 
in  the  following  charts  are  not  identified  by  name. 


Designing  the  Maryland  Study 

The  Maryland  project  covered  hospital  surgery  cases. 
Figure  4  shows  the  process  by  which  I  further  defined  this 
category,  starting  with  an  overall  set  of  205,000  cases  and 
3,400 deaths for a death rate of 1.69% (in Box C).

Prior  study  had  indicated  to  me  that  the  best  predictor  of 
death  for  a  surgical  case  was  whether  the  surgery  was 
elective  or  not.  Not  all  hospitals  code  elective  surgery 
accurately,  however,  so  I  tested  coding  with  some  simple 
edits.  If  a  hospital  had  a  lot  of  emergency  bunionectomies 
or  elective  appendectomies  I  felt  that  something  was  wrong 
with  their  coding.  These  edits  found  11  hospitals  with 
suspect coding, and I eliminated them from the study (Box
D).
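An edit of this kind can be sketched as a plausibility check on admission-type coding. Everything specific below is an assumption made for illustration: the record layout, the two procedure lists, and the thresholds (a minimum of 20 checked cases and more than 10% implausible codes) are hypothetical, since the talk does not give the actual rules used in the Maryland study.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# Hypothetical edit lists: procedures that should almost never be coded
# as emergency, or almost never be coded as elective.
RARELY_EMERGENCY = {"bunionectomy"}
RARELY_ELECTIVE = {"appendectomy"}

Record = Tuple[str, str, bool]  # (hospital id, procedure, coded as elective)


def suspect_hospitals(
    records: Iterable[Record],
    max_implausible_share: float = 0.10,
    min_cases: int = 20,
) -> List[str]:
    """Flag hospitals whose elective/emergency coding looks implausible."""
    checked = defaultdict(int)
    implausible = defaultdict(int)
    for hospital, procedure, is_elective in records:
        if procedure in RARELY_EMERGENCY:
            checked[hospital] += 1
            implausible[hospital] += int(not is_elective)
        elif procedure in RARELY_ELECTIVE:
            checked[hospital] += 1
            implausible[hospital] += int(is_elective)
    return sorted(
        h for h, n in checked.items()
        if n >= min_cases and implausible[h] / n > max_implausible_share
    )


# Invented records for illustration; not Maryland data.
records = (
    [("H1", "bunionectomy", False)] * 10   # emergency bunionectomies
    + [("H1", "bunionectomy", True)] * 20
    + [("H2", "appendectomy", False)] * 30
    + [("H2", "cholecystectomy", True)] * 5
)
print(suspect_hospitals(records))
```

Procedures outside the two edit lists are ignored, so the check tests only coding that can be judged on its face.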

I  also  eliminated  cases  with  a  secondary  malignancy 
diagnosis.  I  do  not  feel  that  one  hospital  episode  is  the 
proper  way  to  measure  the  outcome  of  a  protracted,  serious 
condition  such  as  advanced  cancer.  I  left  out  obstetrics; 
they  had  almost  no  deaths.  I  left  out  open  chest  heart 
massage  when  it  was  the  only  surgery  because  it  is  not  a 
cause  of  death;  the  causation  runs  in  the  other  direction. 
If  in  fact  a  patient  had  a  cholecystectomy,  followed  by 
open  heart  massage  and  death,  that  patient  was  considered 
as  a  death  following  cholecystectomy. 

I was left, then, with 101,000 elective cases (in Box I)
that had a death rate of only 0.36%, and 52,000 non-elective
cases (in Box J) with a death rate of 3.95%. That is
nearly 11 times as high, showing that elective surgery is
an excellent predictor in itself if it is accurately coded.
The remainder of the analysis was performed on the
non-elective cases in Box L.

The top of Figure 5 shows the discharge abstract variables
used  as  the  sources  of  the  risk  predictors  given  in  more 
detail  at  the  bottom  of  the  figure.  In  diagnosis,  I 
trusted  all  the  principal  diagnoses  as  having  been  present 
when  the  patient  entered  the  hospital,  because  that  is  a 
definition  of  principal  diagnosis.  I  could  not  trust  the 
other  diagnoses  in  that  way  because  it  was  often  impossible 
to  tell  a  complication  from  a  co-morbidity  which  was 
present on admission (for example, a secondary
diagnosis of acute broncho-pneumonia). I did, however,
allow  all  chronic  conditions  and  all  non-hospital  trauma 
cases,  because  they  had  to  have  preceded  admission. 

You  have  heard  a  talk  today  on  physiological  measures  as 
risk  predictors.  I  would  love  to  have  data  on  them,  and  if 
I did I would add a fourth column to Figure 5, but I would
not  throw  away  diagnosis,  procedure,  age,  and  sex.  I 
cannot  tell  ahead  of  time  what  the  best  risk  predictor  will 
be  for  a  specific  condition.  With  more  pertinent  predictor 
information,  better  models  can  be  built. 

Figure  6  tests  the  model  for  broad  categories  of  cases  by 
comparing  observed  and  expected  deaths  within  major  surgery 
categories,  and  the  model  does  achieve  matches  that  are 
within  statistical  variation  (i.e.,  by  comparing  the 
numbers  of  observed  deaths  to  the  expected  deaths  on  each 
line). Divide the same cases into trauma and non-trauma,
and the results are still valid.
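The comparison on each line of such a table can be sketched as follows. The sketch assumes that expected deaths for a group are the sum of the predicted risks of its individual cases, which is the usual definition but is not spelled out in the talk. The record layout, the function name, and the example cases are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (surgery category, predicted risk of death, died in hospital)
Case = Tuple[str, float, bool]


def observed_and_expected(cases: Iterable[Case]) -> Dict[str, Tuple[int, int, float]]:
    """Return {category: (cases, observed deaths, expected deaths)}."""
    count = defaultdict(int)
    observed = defaultdict(int)
    expected = defaultdict(float)
    for category, predicted_risk, died in cases:
        count[category] += 1
        observed[category] += int(died)
        expected[category] += predicted_risk  # expected = sum of per-case risks
    return {c: (count[c], observed[c], expected[c]) for c in sorted(count)}


# Invented cases for illustration; not data from the Maryland study.
example_cases = [
    ("digestive", 0.25, False),
    ("digestive", 0.25, False),
    ("digestive", 0.50, True),
    ("cardiovascular", 0.50, True),
    ("cardiovascular", 0.50, True),
]
summary = observed_and_expected(example_cases)
for category, (n, obs, exp) in summary.items():
    print(f"{category}: {n} cases, {obs} observed vs. {exp:.2f} expected deaths")
```

The same function can be run with a different grouping key, such as trauma versus non-trauma or a specific procedure code, which is the kind of regrouping described above.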

Figure  7  is  a  more  precise  test  of  the  model,  using  a 
segment  of  gastro-intestinal  surgery.  Where  there  are 
enough  cases  to  yield  a  valid  result,  the  model  has  very 
close  matches  of  observed  and  expected  deaths.  This  is  a 
challenge  I  give  to  all  other  models  vying  to  measure 
mortality outcomes: does their accuracy extend to specific
procedures?  If  it  does  not,  it  can  be  torn  apart  by 
doctors  claiming  to  do  more  difficult  procedures  than  the 
norm  within  a  given  surgical  category. 


Some  Results  of  the  Maryland  Study 

Figure  8  shows  the  results  of  the  model,  in  descending 
order  of  total  volume  of  cases.  Hospital  A  is  by  far  the 
largest  because  it  is  the  state-wide  trauma  center,  and  it 
is  also  a  major  teaching  hospital  that  gets  very,  very 
difficult  cases  of  all  kinds.  The  only  one  close  to  it  in 
volume,  Hospital  B,  is  Maryland's  other  major  teaching 
hospital.

The key to interpreting the results is the chi-square
column; if it is 4 or higher, it bears investigation. The
first hospital with a chi-square of over 4 is Hospital N,
which had 15 deaths vs. 26.3 expected, extremely favorable.
Even more favorable is Hospital DD, with 3 deaths vs. 10.7
expected.

On  the  other  hand,  Hospital  FF  had  17  deaths  vs.  10 
expected,  and  Figure  9  applies  the  model  to  this  hospital 
in  more  detail  to  show  that  the  potential  problem  centers 
on  the  gastro-intestinal  service.  The  chart  also  gives  you 
a  similar  analysis  of  Hospital  L.  Overall  this  hospital 
was  only  slightly  adverse,  but  an  examination  by  service 
identifies  a  possible  serious  problem  with  cardiovascular 
surgery.  Thus  one  can  have  an  acceptable  hospital  with  a 
single  questionable  service.  For  this  reason  I  think  that 
any  monitoring  system  must  give  data  not  only  for  the  whole 
hospital,  but  also  on  a  service-specific  basis. 
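The screening rule can be sketched with the observed and expected deaths quoted above for Hospitals N, DD, and FF. The talk gives the threshold of 4 but not the formula behind the chi-square column, so the one-cell statistic (observed minus expected) squared, divided by expected, is an assumption here. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def chi_square(observed: float, expected: float) -> float:
    """One-cell statistic (O - E)^2 / E; an assumed form, not stated in the talk."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return (observed - expected) ** 2 / expected


def screen(
    hospitals: Dict[str, Tuple[float, float]], threshold: float = 4.0
) -> List[Tuple[str, float, str]]:
    """Return (hospital, chi-square, direction) for hospitals at or above the threshold."""
    flagged = []
    for name, (observed, expected) in hospitals.items():
        stat = chi_square(observed, expected)
        if stat >= threshold:
            direction = "favorable" if observed < expected else "adverse"
            flagged.append((name, round(stat, 2), direction))
    return flagged


# Observed and expected deaths as quoted in the talk.
quoted = {"N": (15, 26.3), "DD": (3, 10.7), "FF": (17, 10.0)}
results = screen(quoted)
for name, stat, direction in results:
    print(f"Hospital {name}: chi-square {stat} ({direction})")
```

Under this assumed formula all three hospitals exceed 4, which is consistent with the way they are singled out above. The same function can be applied service by service within a hospital, as the discussion of Hospital L suggests.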

Figure  10  shows  another  way  to  arrange  the  data:  by  season 
of  the  year.  It  breaks  out  the  five  weeks  of  July,  all 
other  weeks  with  a  national  holiday,  and  all  other  weeks. 
Look  at  the  national  holiday  weeks:  144  observed  deaths, 
120  expected.  There  is  also  a  suspicion  of  a  problem  at 
the  two  major  teaching  hospitals  in  July.  We  have  also 
found  differences  by  days  of  the  week,  not  shown  here. 

In  another  arrangement  of  the  data,  we  found  some 
interesting  differences  by  source  of  payment:  Medicare 
cases  in  non-teaching  hospitals  did  not  do  as  well  as  those 
in  teaching  hospitals.  We  would  never  have  known  that  had 
we  not  had  comparisons  with  non-Medicare  cases — so, 
although  Medicare  is  primarily  interested  in  Medicare,  it 
could  be  helpful  for  them  to  study  non-Medicare  patients  in 
the  same  hospitals. 


Conclusion 

Is  our  risk  model  any  good?  We  cannot  prove  that  any  model 
is  good;  we  can  only  try  very  hard  to  prove  that  it  is  bad 
in  some  fashion.  I  am  now  working  with  Maryland  hospitals 
on  this,  and  they  are  trying  to  find  flaws.  I  do  not  know 
whether  I  have  bad  data  or  irregular  coding,  whether  I  have 
failed  to  include  important  variables,  whether  the  model  is 
biased  in  some  way,  or  whether  apparent  disparities  in 
observed  and  expected  deaths  are  due  to  chance  alone.  It 
is  impossible  to  eliminate  chance,  but  we  are  working  to 
test  the  other  factors. 

In summary, I think work of this sort is feasible if you
approach it intelligently, make progressive incremental
improvements, and avoid the temptation to revolutionize the
world of research. All of us could use better data. That
is where the money should go in the short run. When I
build these models, I spend 80% of my time cleaning up
data, and I would much rather study the epidemiology of
good care than the epidemiology of good data.


Dr.  Mark  Blumberg  is  a  physician  who  has  been  engaged  in 
developing  a  variety  of  analytic  methods  for  health  care 
service  data  through  his  work  as  Director  of  Special 
Studies  for  Kaiser  Foundation  Health  Plan,  Inc.  He  was 
elected  a  member  of  the  Institute  of  Medicine  of  the 
National  Academy  of  Sciences  in  1980.  He  has  recently 
served  as  an  advisor  to  several  organizations  concerned 
with measuring outcomes of care (JCAH, HCFA).




FIGURE 1    An Analytic Schema to Develop Risk-Adjusted Outcomes


1.  Select universe for study (e.g., persons, patients, cases)

2.  Select clinical care subject (e.g., diagnosis, symptom,
    procedure)

3.  Dependent variable:
    Select appropriate measure(s) of outcome
    (e.g., death, disability)

4.  Independent variables:
    List candidate patient or care attributes to measure risk
    of adverse outcome (N.B.: these must be present prior to
    the care being studied.)

5a. Review data resources to determine which variables are
    suitable

5b. Select cases to serve as standard

6.  Select technique to estimate expected risk of adverse
    outcome for each case

7a. Select final patient or care attribute variables to measure


Mark S. Blumberg, M.D.


Figure 2

FACTORS RELATING TO FEASIBILITY OF
RISK-ADJUSTED MONITORING OF OUTCOMES


                                        Relative Feasibility
Factor                          More                          Less

1  Health Condition
   Agreement on Definition      High                          Low
   Duration                     Acute or Acute Flare-up       Chronic
                                of Chronic Condition
   Recovery from Condition      Common                        Rare
   Spontaneous Remissions       Rare                          Common

2  Health Care Service
   Agreement on Definition      Yes (e.g. most surgery)       No (e.g. many
                                                              hospitalized medical
                                                              cases)
   Palliative                   No                            Yes

3  Adverse Outcome
   Measurement or Definition    Easy (particularly if         Difficult (e.g.
                                categorical, e.g. death)      quality-adjusted
                                                              life years)
   Criteria for Classification  Objective                     Subjective
   Incidence of Adv. Outcome    Not Rare (5 or more           Rare (Under 1
                                expected per year             expected per year
                                in hospital)                  in hospital)
   Accurately Recorded          Yes                           No

4  Availability of Data
   on Outcomes and Risk Factors
   Computerized                 Yes (e.g. hospital            No (e.g. office care)
                                discharge abstract)
   Standard Coding              Yes (e.g. ICD-9-CM for        No (e.g. office care)
                                diag & procedure)


Figure  3 

POTENTIAL  PURPOSES  OF  RAMO  SYSTEMS 
(RAMO = Risk-Adjusted Monitoring of Outcomes)

*  1.  To  identify  specific  providers  which  have  outcomes 

a.  worse  than  expected, 

b.  better  than  expected. 

2.  To  determine  whether  there  are  (cross-sectional)  differences  in 
outcomes  by 

*  a.    type  of  provider  (e.g.,  teaching  hospitals,  proprietary 

hospitals,  local  government  hospitals),  or 

*  b.    alternate methods of paying providers (e.g., FFS, PPO, HMO),

c.  area  of  the  country, 

*  d.    provider  experience  or  volume. 

3.  To  measure  trends  in  outcomes  over  time  to  assess  impacts  of 

a.  changes  in  payment  (e.g.,  PPS  for  Medicare), 

b.  medical  technology, 

c.  activities  of  PROs. 

*  4.  To  monitor  outcomes  so  that  clusters  of  unexpected  adverse  outcomes  are 

detected  and  investigated  (e.g.,  equipment  failures,  psychopathic 
providers,  infectious  disease  outbreaks). 

5.  To  measure  the  relative  risk  of  adverse  outcomes 

a.  by  patient  characteristics  that  are  associated  with  differing 
levels  of  expected  risk, 

b.  by  provider,  payment  source,  etc. 

6.  To  detect  inconsistent  data  by  noting  unexplained  changes  in  expected 
risks  by  time  and  place. 

*    Examples given later in this presentation


Figure 6
MARYLAND DISCHARGE DATA 4/84-3/85
HIGH RISK EMERGENCY & MEDIUM RISK NON-ELECTIVE SURGICAL CASES
(DATA SUPPRESSED ON LINES WITH 5 OR LESS CASES)


TRAUMAP   SURGCAT   CASES   OBS'D    EXPECTED   CHI2      OBS'D   EXPECTED   OBS-EXP
                            DEATHS   DEATHS     OBS EXP   DRATE   DRATE
                                                DEATHS    x 100   x 100

.   TOTAL                     .  8745 

1*NERV  SYS  PROC  987 

2-ENDO  SYS  PROC  1? 

3*EYE  PROCEDURE  11 

5SN0SE,NTH,PHARY  26 

6*R£SP  SYS  PROC  371 

7=CARD  SYS  PROC  2342 

'  :'"'^:!  175 

9«DiG£ST  STS  PRC  4917 

19*UR imtl  PROC  278 

l2=FIfi  GEN  PROC  12 
13*OBSTETRIC  PRC 

14^HJSCUL0SXLTAL  476 

15»IWTE&W(WT)  32 

PROCEDUR  .  v 

TRAUMA        ALL  .1328 

IsNERV  SYS  PROC  479 
2=ENW  SYS  PROC 
3*EYE  PROCEDURE 
5=NOSE.HTH,PHftfiY 

6*RESP  SYS  PROC  50 

7«CARD  SYS  PROC  139 

8»(€«IC  4  LYMPH  199 

9*DIG£ST  SYS  PRC  417 

19»URINARY  PROC  (4 
12<EN  gen  PROC 

14*HUSCUL0SSLTAL  124 

\5*wm<mmn)  13 
mm  wmm 

NON  TRAUHA  Ail  7417 

1*NERV  SYS  PROC  517 

i*mF:\>..m  8 

5*NOJE,NTH,PHJKI  24  , 

6-RESP  SYS  PROC  321 

?*CARD  SYS  PROC  2212 

8*HEMC  4  LYKPH  75 

9=DIGEJT  HS  PRC  3696 

1HJRINARY  PROC  264 

12«PEH  GEN  PROC  19 
13«0KTETRI£  PRC 

i4*HUICUL0SKLTAl  346 

15=INTEGUM(BRST)  1?


Figure 7

MARYLAND DISCHARGE DATA 4/84-3/85
HIGH RISK EMERGENCY & MEDIUM RISK NON-ELECTIVE SURGICAL CASES
FOR PATIENTS WITHOUT PDX OF TRAUMA, WITH GI PROCEDURES


CASES   OBS'D    EXPECTED   CHI2      OBS'D   EXPECTED   OBS-EXP
        DEATHS   DEATHS     OBS EXP   DRATE   DRATE
                            DEATHS    x 100   x 100

444 

443.476 

1.44 

1.441 

12 

.32 

9  a< 

4221 «  ESOPHAGOSCflPY  BY  INCH 

• 

1.489 

6.444 

12 

.22 

-12.22 

4239*  DEITRUCT  ESOPHAG  OS  NEC 

3 

5.451 

1.11 

4.554 

34 

.44 

-15.31 

4249*  ESOPHAGECTOMY  NOS 

• 

1.277 

4.444 

(3 

.84 

-13.84 

4241*  PARTIAL  EOTm£CTOHT 

• 

l.3?« 

4.444 

12 

.31 

-12.35 

4242*  TOTAL  ESOPHAGECTOMY 

• 

•.495 

4.444 

9 

91 

-9.91 

4251 «  THORAC  ESQJmQESOPHMCOJ 

#.447 

1.544 

44 

67 

33.33 

4232*  THORAC  ESOPHAGOGACTROST 

5 

3.411 

1.444 

42 

43 

19.87 

4259*  im&  tm*&  mn  *c 

2 

1.734 

1.153 

57 

84 

8.8? 

42t>  SUTURE  ESOPHAGEAL  LACED 

1 

1.349 

4.444 

34 

.91 

-34.91 

4219*  ESOPHAGEAL  REPAIR  «C 

1 

(.741 

9.548 

22 

.91 

-9.51 

4291*  INJECT/LIGAT  EJOPH  VARIX 

9 

11.244 

4.14 

4.877 

15 

.32 

-1.89 

m  *  GASTROTOHY 

(2 

13.221 

4.11 

4.94S 

35 

.73 

-3.34 

4142*  LOCAL  GASTR  EXCISION  KC 

2 

1.771 

1.134 

19 

.42 

*  1.35 

4349*  LOCAL  GASTR  DOTRUCT  NEC 

2 

3.211 

9.423 

9 

.17 

-3.44 

436  *  DISTAL  GASTRECTOMY 

2 

2.223 

9.944 

6 

.91 

-9.44 

437  *  PART  mm  V  JEJ  ANAXT 

16 

17.429 

4.15 

9.944 

13 

.24 

-1.23 

4389*  PARTIAL  GASTRECTOMY  NEC 

1 

1.211 

4.826 

5 

.27 

-4.92 

4399*  TOTAL  GASTRECTOMY  NEC 

3 

2.237 

1.341 

24 

.85 

8.48 

44M*  vagotomy  m 

• 

1.314 

9.444 

5 

.26 

-5.26 

4491 «  TRUNCAL  VAGOTOMY 

2 

3.143 

4.432 

7 

.72 

-2.84 

4411*  TRAWAIDOHIN  GASTRQSCOPY 

1 

1.459 

4.944 

52 

.93 

-2.95 

442  *  PYLOROPLASTY 

4 

5.984 

4.44 

4.449 

19 

.31 

-3.41 

4439*  GASTWJEHTEROSTONY  NEC 

8 

9.383 

4.24 

4.853 

14. 

18 

-2.38 

44#«  mm  peptic  ulcer  «s 

4 

3.944 

1.419 

33 

W 

9.33 

12 

15.249 

4.74 

4.786 

29 

92 

-4.48 

4442*  SUTURE  SU0KN  ULCER  fITE 

29 

29.139 

4.44 

4.995 

22. 

94 

-9.11 

44«>  close  gastric  Finn  CC 

• 

4.544 

9.444 

59 

44 

-54.44 

4449*  GASTRIC  REPAIR  NEC 

2 

1.397 

1.432 

9 

31 

4.42 

4491*  LIGATE  GASTRIC  VARICES 

t 

4.374 

■ 

9.944 

18. 

79 

-18.79 

4499*  GASTRIC  OPERATION  NEC 

2 

1.117 

1.794 

22 

35 

17.65 

4344*  INTESTINAL  INCISION  NOS 

2 

1.971 

1.915 

4 

.86 

9.14 

4*41*  DUODENAL  INCISION 

4 

3.943 

1.925 

2i 

.48 

4.54 

4542*  SHALL  IML  INCISION  NEC 

3 

3.312 

4.887 

12. 

48 

-1.34 

4543*  LARGE  KNEL  INCISION 

1 

4.843 

1.187 

4 

44 

9.83 

4532*  KSTRUCT  MMM  LES  NEC 

4 

3.399 

1.177 

17, 

94 

3.44 

4S51*  SH  KMEL  SEGMENT  ISOLAT 

2 

4.792 

2.523 

24. 

44 

44.26 

4541*  HILT  SEG  SN  tONEL  EXCIS 

4 

3.214 

1.244 

32. 

14 

7.84 

4542*  PART  SN  MMEL  RESECT  NEC 

27 

25.441 

4.15 

1.471 

9. 

#7 

4.71 

4543*  TOTAL  RENOVAL  SH  KMEL 

2 

2.744 

4.741 

38. 

57 

-19.44 

4571*  HJLT  JEG  LG  WML  EXCIS 

4.798 

1.253 

24. 

61 

6.73 

4572*  CECECTOHY 

4.453 

1.975 

7. 

43 

1.57 

4573*  RIGHT  HEHICBLiaOMY 

14 

17.153 

9.18 

4.933 

7. 

Q 

-4.52 

45748  TRANSVERSE  COLON  RESECT 

3.889 

4.771 

8. 

14 

-1 .85 

4575*  LEFT  HEMICOLECTOMY 

2.934 

4.482 

4. 

17 

-1.34 

4576-  SIGMOIDECTOMY 

5 

9.191 

1.91 

9.544 

4. 

41 

-2.44 

4579*  PART  LG  WMEL  EXCIS  NEC 

14 

14.343 

4.91 

4.947 

11. 

G? 

-4.39 

451  *  TOT  INTRA- AID  COLECTOMY 

2 

1.441 

1.421 

i. 

21 

3.48 
Figure 8
Figure 9


MARYLAND DISCHARGE DATA 4/84-3/85          FRIDAY, APRIL 24, 1987
HIGH RISK EMERGENCY & MEDIUM RISK NON-ELECTIVE SURGICAL CASES
(DATA SUPPRESSED = LINES WITH 5 OR LESS CASES)

[Table not legible in this copy. Rows: trauma status (all, trauma, non-trauma) by organ category of procedure (nervous system, endocrine, respiratory, cardiovascular, hemic and lymphatic, digestive, urinary, musculoskeletal, integumentary). Columns: cases, observed deaths, expected deaths, chi-square (observed vs. expected), SMR, expected death rate x 100, and observed minus expected death rate x 100.]
PROCESS MEASURES

What  is  the  role  of  health  care  process  measures:  Is 
a  balanced  approach  to  management  of  health  care 
quality  feasible? 

*  Neither  process  measures  nor  outcome 
measures  can  measure  quality  of  care 
satisfactorily  when  used  alone.  A  quality 
assurance  system  must  combine  the  two  to  be 
effective. 

*  The  quality  assurance  system  used  in  U.S. 
military  hospitals  uses  process  measures 
based  on  physician  consensus  about  criteria 
for  treatment  of  various  conditions,  coupled 
with  population-based  outcome  measures. 

*  Because  the  military  system  is  computerized, 
it  utilizes  precisely  defined  data 
indicators  and  objective,  quantifiable 
criteria. 


By  William  B.  Munier,  M.D. 

As  long  as  I  can  remember,  there  has  been  a  debate  about 
whether  one  should  look  at  process  or  outcome.  Process 
measures  were  very  much  in  vogue  during  my  days  as  Director 
of  the  federal  PSRO  program.  They  have  fallen  into 
disfavor  now,  and  outcomes  are  seen  as  the  only  real 
measure  of  quality.  In  my  view,  however,  both  are  very 
useful  when  employed  as  contributory  tools  in  designing  and 
operating  an  effective  quality  measurement  system. 

It  has  already  been  mentioned  today  that  physicians  are 
very  process-oriented.  In  their  view,  process  drives 
outcome.  It  is  not  the  only  determinant  of  outcome; 
obviously,  the  "raw  material"  (that  is,  the  patient)  is 
important. But what the physician does is important, too,
and, in fact, from their standpoint the processes they
employ are the total of their accountability for quality.

The  reason  a  quality  measurement  system  must  examine  both 
process  and  outcome  is  straightforward;  there  is  a  paucity 
of  health  science  information  that  links  processes  to 
outcomes.  If  there  were  a  one-to-one  relationship  between 
what  the  doctor  did  and  what  happened  to  the  patient,  on  a 
problem-  or  diagnosis-specific  basis,  the  mystery  would  be 
over.  But  there  is  no  such  simple  linkage,  and  the  medical 
literature  does  not  examine  possible  links  in  any  great 
depth or to any great degree. When I was with the PSRO
program and frustrated by both the lack of national
standards  and  the  lack  of  information  upon  which  PSRO 
physicians  could  base  their  process  criteria,  I  initiated  a 
study  of  the  underlying  health  science  information  for 
three  diagnoses.  The  study,  supported  by  NIH,  demonstrated 
that  the  underpinnings  of  diagnosis  and  treatment  for  these 
diagnoses  were  not  very  firm.  In  fact,  they  were 
shockingly  infirm. 

In  the  absence  of  hard  scientific  information  linking 
process  to  outcome,  it  is  very  hard  to  pin  down  whether 
specific  processes  are  truly  related  to  the  poor  outcomes 
one  is  trying  to  correct.  In  addition,  because  most 
quality  assurance  efforts  do  not  have  the  controls  found  in 
clinical  trials  (and  I  don't  think  anyone  has  ever 
suggested  that  a  quality  assurance  process  should 
approximate a clinical trial), there will be uncontrolled
variables  that  can  confound  the  attempt  to  link  process  and 
outcome.  This  ambiguity  shows  why  one  really  does  need  to 
look  at  both  measures. 

There  is  another  reason  for  using  both,  and  that  is  the 
complexity  of  the  medical  management  process.  Let  us  take 
a  patient  who  has  an  acute  myocardial  infarction.  There  is 
little  in  the  literature  to  suggest  definitive  steps  that 
must  be  followed  in  order  to  achieve  good  quality  care  and 
a  good  patient  outcome.  A  practical  way  to  generate  the 
desired  information,  a  "best  approximation,"  is  to  assemble 
a  panel  of  physicians  and  arrive  at  consensus  for  treatment 
of  the  uncomplicated  acute  myocardial  infarction.  A  doctor 
then  treats  a  patient  who  has  severe  emphysema  as  well,  and 
he  does  not  follow  all  the  criteria  because  his  patient  has 
emphysema.

That  is  justified;  the  doctor  makes  those  decisions.  But 
the  criteria  become  more  complex  if  you  try  to  incorporate 
the  additional  complicating  factor  of  emphysema.  If  the 
patient  also  has  hypertension,  that  adds  another  factor, 
and  so  on  for  all  the  combinations  and  permutations  of 
complications  and  co-morbidities,  as  well  as  the  anatomic 
and  physiologic  differences,  with  which  patients  can 
present.  Thus  we  are  faced  with  a  situation  where  process 
criteria  cannot  be  developed,  practically  speaking,  for  all 
the  variations  a  doctor  will  see  in  treating  even  50  or  100 
patients  with  a  given  diagnosis.  This  complexity  largely 
explains  why  people  have  thrown  up  their  hands  about  using 
process  criteria. 

Unfortunately,  moving  to  outcome  criteria  in  isolation  from 
process  does  not  solve  the  problem  either.  As  far  as  I  am 
aware,  there  are  no  national  standards  about  what 
constitutes appropriate outcomes for most conditions.
There is not even consensus about the appropriate
parameters  for  outcomes.  The  slope  is  just  as  slippery, 
whether  you  look  at  outcome  or  process. 


A  Balanced  Approach 

The  question  thus  becomes:  How  do  you  approach  the 
problem,  and  what  is  practical  while  still  being 
scientifically  valid?  I  think  that  looking  at  process  and 
outcome  in  conjunction  with  each  other  is  probably  the  best 
approach,  recognizing  the  limitations  that  both  of  these 
measures  have  at  the  present  time. 

Specifically,  one  approach  that  seems  to  have  some 
functionality  is  to  develop  a  consensus  of  experts  on 
process  criteria  which  deal  with  the  uncomplicated  patient 
who  has  a  given  diagnosis.  These  criteria  can  identify 
care  which  deviates  from  that  which  physicians  have  agreed 
probably  represents  the  best  medical  knowledge  about 
treating the uncomplicated patient. In these cases, the
criteria approach a prescription for what you ought to do
as a limit.

In a patient with abnormal anatomy or physiology or co-morbidities,
however, the criteria begin to break down. A
good  physician  will  not  follow  them  blindly,  but  will  take 
into  account  the  other  factors  that  present  with  the 
patient.  Yet  even  in  these  cases  the  criteria  can  act  as 
screens,  in  the  sense  that  if  they  were  met  it  is  likely 
that  the  physician  was  providing  appropriate  care.  There 
will,  of  course,  be  a  number  of  false  negatives. 

I  also  think  one  can  say  that  when  failed  process  criteria 
appear  in  a  statistically  significant  conjunction  with  poor 
outcomes,  there  is  a  high  degree  of  probability  that 
substandard  care  is  being  delivered.  Therefore,  by 
employing  consensus  criteria  in  conjunction  with  outcomes, 
and  looking  at  what  happens  with  populations  of  patients  in 
an  institution,  for  a  given  physician,  and  by  diagnosis  or 
problem,  one  can  begin  to  approach  valid  measures  of 
performance. 
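One way to make "statistically significant conjunction" concrete is a chi-square test on a two-by-two table of cases: criteria failed or met, against outcome poor or good. The sketch below is only an illustration of that idea. The function names, the 0.05 significance level, and the counts in the usage note are assumptions and do not come from the military program, and the approximation is unreliable when any cell is expected to hold only a few cases.

```python
CRITICAL_05_1DF = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                        poor outcome   good outcome
        criteria failed       a              b
        criteria met          c              d
    """
    n = a + b + c + d
    row_failed, row_met = a + b, c + d
    col_poor, col_good = a + c, b + d
    if 0 in (row_failed, row_met, col_poor, col_good):
        raise ValueError("every row and column needs at least one case")
    return n * (a * d - b * c) ** 2 / (row_failed * row_met * col_poor * col_good)


def suggests_substandard_care(a: int, b: int, c: int, d: int) -> bool:
    """True when failed criteria go with poor outcomes more often than chance allows."""
    excess_poor_outcomes = a * d > b * c  # association runs in the worrying direction
    return excess_poor_outcomes and chi_square_2x2(a, b, c, d) > CRITICAL_05_1DF
```

With invented counts for one institution (12 failed-criteria cases with poor outcomes, 8 failed with good outcomes, 5 met with poor outcomes, 75 met with good outcomes), the statistic is well above the critical value, so the group of cases would be referred for physician review rather than judged automatically.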


From  Theory  to  Practice 

I  was  fortunate  enough  to  become  involved  recently  with  a 
very  large  program  of  quality  assurance,  where  I  had  the 
chance  to  design  a  system  using  the  approach  I  have 
described  to  screen  10,000  cases  of  patient  care  per  month. 
As  many  of  you  know,  there  is  only  one  ongoing  quality 
assurance  effort  of  this  magnitude,  and  that  is  the  program 
at U.S. military hospitals worldwide. I'd like to show you
how  that  system  works  as  well  as  some  of  the  results. 

In  design,  the  system  always  starts  with  medical  criteria 
and  standards.  Figure  1  shows  the  actual  criteria  and 
standards  developed  by  the  American  College  of 
Obstetricians  and  Gynecologists  for  review  of  primary 
Caesarean  section  in  this  program.  I  would  just  like  to 
make  a  few  general  observations  about  these  criteria. 

First,  they  are  very  specific.  For  example,  one  of  the 
indications  for  C-sections  is  "failure  to  progress".  I'm 
told  this  can  be  a  catchall  explanation  for  anything  that 
won't  qualify  under  any  of  the  harder  categories,  and  it 
may  be  used  to  justify  questionable  C-sections.  Therefore, 
it is defined very specifically: failure to progress in
labor  for  more  than  two  hours  after  the  cervix  has  been 
dilated  greater  than  or  equal  to  four  centimeters. 

Then,  since  data  of  this  precision  are  not  always 
documented  in  the  chart,  there  is  a  further  breakdown  that 
classifies  the  case  as  having  an  acceptable  indication  but 
poor  documentation.  We  do  attempt  to  differentiate  medical 
error  from  charting  error.  This  is  not  always  possible;  if 
an  historical  fact  is  not  in  the  medical  record,  we  cannot 
prove that it happened. But where it is possible (such as
a notation that a lab test was ordered but no corresponding
notation of the results), we can make a distinction between
definite  medical  error  and  potential  medical  error  or 
potential  charting  error. 
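The "failure to progress" criterion in Figure 1 is explicit enough to be written as a rule. The sketch below is a hypothetical rendering of that one criterion, not the program's actual software: the record fields, the category strings, and the "refer for physician review" fallback for cases Figure 1 does not cover are all assumptions. Figure 1 says both "more than 2 hours" and "2 or more hours"; the sketch uses the latter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaborAbstract:
    """Hypothetical fields abstracted from the chart by non-physician personnel."""
    dilation_cm: Optional[float]  # None when the chart does not record dilation
    ftp_documented: bool          # chart states "failure to progress"
    ftp_hours: float = 0.0        # hours failure to progress was observed


def screen_failure_to_progress(rec: LaborAbstract) -> str:
    """Classify a primary C-section on the failure-to-progress criterion only."""
    observed_long_enough = rec.ftp_documented and rec.ftp_hours >= 2
    if rec.dilation_cm is None:
        # Dilation missing: indication is assumed only if the chart supports it,
        # and the gap is recorded as a documentation problem, not a medical one.
        if observed_long_enough:
            return "presumed indicated, documentation inadequate"
        return "seemingly contraindicated"
    if rec.dilation_cm < 3:
        return "seemingly contraindicated"
    if rec.dilation_cm >= 4 and observed_long_enough:
        return "presumed indicated"
    # Anything the explicit criteria do not settle goes to a physician reviewer.
    return "refer for physician review"
```

Keeping "documentation inadequate" as its own result is what lets a report separate potential charting error from potential medical error.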

A  system  as  explicit  as  this  allows  us  to  tailor  the 
screens  to  our  needs.  You  will  note  that  one  of  the 
indications  for  C-section  is  active  perineal  herpes.  This 
was  not  in  the  criteria  set  originally,  as  it  was  thought 
to  be  a  relatively  uncommon  justification.  When  a  number 
of  such  cases  appeared  every  month,  however,  the  criterion 
was  built  into  the  system. 

Explicit  criteria  make  the  process  of  review  more  efficient 
and  more  objective.  A  quality  measurement  system  employing 
explicit  process  criteria  can  be  characterized  as  (although 
I  dislike  the  term)  a  glorified  exception  reporting  system 
in  which  certain  charts  are  presumed  to  represent  adequate 
care  and  others  are  captured  for  review  by  physicians 
because  the  probability  of  substandard  care  is  presumed  to 
be  higher. 

What  happens  after  the  criteria  are  constructed?  This 
system  is  dependent  on  abstracting  data  out  of  the  medical 
record,  which  proved  to  be  a  helpful  constraint  because  it 
forced us to define the information we wanted very
precisely. This information allows one to determine
whether  criteria  have  been  met,  on  a  diagnosis-  or  problem- 
specific  basis.  Therefore,  by  direct  inference,  this  data 
base  contains  the  most  important  data  in  the  medical  record 
with  respect  to  quality. 

A  second  reason  for  precision  is  that  the  entire  system 
runs  on  a  computer,  and  computers  cannot  deal  with 
subjective  judgments;  they  require  numbers.  I  should  also 
note  that  the  people  who  take  the  data  out  of  the  medical 
record  are  non-physician  personnel,  and  it  is  unwise  to  ask 
for  medical  judgments  at  this  level.  We  therefore  tried  to 
wring  all  judgment  out  of  the  questions,  making  them  as 
objective  as  possible — and  we  hired  the  most  intelligent 
abstractors  we  could  find,  so  that  what  little  judgment 
remained  would  be  addressed  adequately. 

The  computer  uses  the  abstracted  data  to  create  a  mini- 
medical  record  which  serves  as  the  tool  for  physician 
reviewers.  (The  full  record  is  examined  only  infrequently, 
when  the  pertinent  quality  assurance  information  is 
insufficient.)  Figure  2  shows  a  mini-medical  record, 
precisely  the  data  set  abstracted  from  the  medical  record, 
for  the  procedure  hysterectomy.  Figure  3  shows  a  typical 
report,  providing  screening  information  for  all  military 
hospitals  and  broken  out  by  service.  The  report  is  for 
cholecystectomy,  and  it  shows  the  linkage  of  process  and 
outcome.  Figure  4  shows  a  report  for  one  service,  broken 
out by MTF (military hospital). You will note that there
is  a  potential  problem  with  one  institution. 

Figure  5  moves  down  another  level,  to  give  a  detailed 
picture  of  that  institution.  Each  number  at  the  top 
represents a doctor. Interestingly, each of the
questionable cases involved a different doctor. And the
pathology  did  justify  the  operation  in  each  case  even 
though  the  usual  indications  for  surgery  were  missing. 
This  example  might  represent  a  charting  error,  or  it  might 
represent  doctors'  intuition  at  work. 


Conclusion 

I  have  used  surgical  examples  because  they  are  clearer  and 
simpler,  but  this  method  works  equally  well  for  medicine. 
The  steps  are  exactly  the  same:  arrive  at  consensus  about 
the  criteria  for  acceptable  processes  and  outcomes  in  a 
given  diagnosis  or  problem,  define  the  data  set,  link 
criteria  and  data  sets  with  medical  logic,  and  construct 
the  required  system  (on  computer  or  by  hand)  to  review 
cases.  Such  a  quality  measurement  system  is  accountable  on 
a case-by-case basis, by diagnosis, by physician, and by
institution. Depending on the number of cases that pass or
fail  over  time,  the  system  will  produce  a  comprehensive 
picture  of  what  kind  of  care  doctors  are  delivering,  as 
judged  by  their  peers. 
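The roll-up from individual cases to diagnosis, physician, and institution can be sketched as a simple tally of screening results. This is a minimal illustration under assumed names: the dictionary keys and the `tally` function are hypothetical and are not taken from the system described above.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple


def tally(cases: Iterable[Mapping], key: str) -> Dict[str, Tuple[int, int, float]]:
    """Return {group: (cases reviewed, cases failed, failure rate)} for one grouping key.

    Each case is a mapping with a boolean "passed" entry plus grouping entries
    such as "institution", "physician", or "diagnosis".
    """
    reviewed: Counter = Counter()
    failed: Counter = Counter()
    for case in cases:
        group = case[key]
        reviewed[group] += 1
        if not case["passed"]:
            failed[group] += 1
    return {g: (n, failed[g], failed[g] / n) for g, n in reviewed.items()}
```

Calling the same function with "institution", then "physician", then "diagnosis" gives the successive levels of detail that Figures 3 through 5 step through.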


*  *  * 

Dr.  William  Munier  is  a  physician  and  President  of  Quality 
Standards  in  Medicine,  Inc.,  a  software  and  consulting  firm 
specializing  in  quality  of  care  assessment.  Dr.  Munier  was 
formerly  Director  of  the  Professional  Standards  Review 
Organization  (PSRO)  program  of  the  Department  of  Health  and 
Human Services (then DHEW). More recently, he served as
Program  Director  for  the  Department  of  Defense  External 
Civilian  Peer  Review  Program  designed  to  measure  quality  of 
care  in  all  167  military  hospitals  worldwide. 
FIGURE 1


TASK 3A: PRIMARY CAESARIAN SECTIONS


STANDARDS

Case based:

    Procedure is indicated                      100%
    Procedure is contra-indicated                 0%
    5 minute Apgar score is less than 7           0%

Population based:

    Rate of primary C-section is less than 15% of all deliveries


CRITERIA

Screening Categories and Explanation

1. CAESARIAN SECTION PRESUMED INDICATED

Surgery is assumed to be indicated if the medical record shows any
of the following:

a. Active perineal herpes

b. Malpresentation is recorded as transverse lie, breech, brow, or
   face on last predelivery sonogram or on last predelivery physical
   exam/pelvic exam

c. Failure to progress in labor for more than 2 hours after cervix
   has been dilated greater than or equal to 4 centimeters

   (1) If chart documents "failure to progress" in labor and failure
       to progress was observed for 2 or more hours prior to
       Caesarian section

   (2) If the chart does not document the dilation of cervix in
       centimeters the procedure is assumed indicated but the
       documentation is characterized as inadequate

d. Last predelivery sonogram indicates uterine hemorrhage (third
   trimester bleeding) from any of the following:

   (1) Placenta previa

   (2) Placenta abruptio

e. Prolapsed cord

f. Failed induction from any of the following:

   (1) Diabetes mellitus, or

   (2) PIH (pregnancy induced hypertension/pre-eclampsia), or

   (3) Postmaturity, or

   (4) IUGR (intrauterine growth retardation), or

   (5) Cardiac disease

g. Fetal distress as indicated by one of the following:

   (1) Meconium staining, or

   (2) Fetal heart rate less than 60 as indicated by fetal heart
       monitoring

2. CAESARIAN SECTION SEEMINGLY CONTRAINDICATED

Surgery is seemingly contraindicated if:

a. Dilation of cervix is not recorded on last pelvic exam and

   (1) Chart does not indicate "failure to progress in labor", or

   (2) "Failure to progress" was observed for less than 2 hours
       before scheduling C-section

b. Dilation of cervix on last pelvic exam is less than 3 cm
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'••>'  CON" 2 DE NT] At.  QUALITY  ASSONANCE  WORKING  PAPER  •••••     PAGE :  3 

date  prepared:  1bs7         dod  peer  review  d03s102o 

ind:vidjal  patient  abstract 
mtp :  review  period: 


PATIENT  SSN:                             FMP:  30  SPOUSE/X-SPOUSE  D1S  DT  : 
TASK:   33  HYSTERECTOMIES 

1.  Dott  Chart   indicate  patient  hao: 

recurrent  pelvic  irvf larrmatory  outut  (PID)?  NO 

recurrent   anemia?  NO 

2.  Docs  cnen   indicate  patient  experienced  cnromc  pain?  NO 

Is  the  lengtn  of  t  im»  indicated"5  NO 
Enter  t imt  in  months  (if  lets  tnan  1  month,  enter 
1 ) . 

Does  chart  indicate  seventy  of  pam?  NO 
Enter  seventy  of  oam 

3.  Does  cnart  indicate  patient  complained  of  chrome,  excessive 
menstrual  Bleeding?  NO 

1:  tne  length  of  time  indicated"5  NO 

Enter  length  of  excessive  pleeomg  (if  less  than  1 

month,  enter  1 ) 
Does  chart  indicate  length  of  period  (menstrual  flow)?  NO 

Enter  ngmoer  of  days  of  period  (if  a  range  1s  given. 

enter  nignest  numoer). 

«.  Does  cnart  indicate  patient: 

was  treated  medically  for  excessive  menstrual  pieefling?  NO 

Clinical  note  indicates  failed  medical  therapy?  NO 

Had  at  least  one  D*C  NO 

was  it  for  excessive  menstrual  pleeomg?  NO 
was  enoometnai  oiODSy.  vaora  aspiration,  or 

enoometrial  curettage  oone"'  VES 

Had  PaD  smear  wltmn  last  12  months?  YES 

Is  sexually  active?  NO 

Is  taking  contraceptives'  NO 

Has  an  1UD?  NO 

Is  pregnant?  NO 

Gravity?  03 
Parity.? 

5.  Does  chart  indicate  that  pre-op  patient  nad: 

Prolapsed  uterus?  NO 

Ruptured  uterus?  NO 

Stress  incontinence?  NO 

6.  was  a  leparoseopy  performed?  NO 

7.  Pos t -opera t ive  diagnosis  in  cnart  is: 

Stateo?  YES 

Cancer,  cervix?  NO 

Cancer,  enoometMum?  yes 

Cancer,  ovarian?  NO 

Pioroids  ( lelomyomata)?  NO 

Tuoo-ovar tan  aoscess'  NO 

Prolapsed  uterus?  NO 

Ruotureo  uterus?  NO 

Recurrent  PID  with  cnromc  pain'  n<"> 

Excessive  uterine  (menstrual)  pieeding?  no 

Endometriosis?  NO 

Aoanomyosi s?  NO 

Cnromc  pelvic  o»m?  NO 

8.  was  there  a  repair  of  vaginal  prolapse7  NO 
If  patients  pest-oo  diagnosis  1s  cancer,  what  1s  stage? 

10.  If  patient's  post-op  diagnosis  is  fioroias,  what  is  size  of 
uterus  recoraeo  m  climcai  notes: 

Stated'  NO 

<  6  weeKs?  NO 

6-8  weeks?  NO 

B-12  weeKS'  NO 

13-16  weeKs?  NO 

>  16  NO 

11.  Pathology  report: 
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FIGURE  2 


•  ••••  CONcID£N'TI  AJ.  OUALITY  ASSURANCE   WORKING  «>APER  •••»•     PAGE :  * 
BATE  PREPARED:  1987  DOD  PEER  RE  VIE*  D03S1020 

INDIVIDUAL  PATIENT  ABSTRACT 
MTF :  REVIEW  PERIOD: 

PATIENT  SSN:                           PMP:  30  SPOUSE /X- SPOUSE  OIS  DT : 

TASK:  36  HYSTERECTOMIES 

Rooort  1ft  cnart?  NO 

Result   m  cnsrt?  YES 

Ovarian  cancer?  NO 

EnaometncT  c»n:»r?  YES 

Cervical  cancer?  NO 

Fiproias  (leiomyomata)?  NO 

Tuoo~ov»p i*n  aDscess?  NO 
Ruet ure? 

Cervical  neoplasm?  NO 

InvMlv*'  NO 

Non-irwat 1 va?  NO 

EnoometMosU?  NO 

Aoenornyosis?  NO 

Normal?  NO 

Enter  walgnt  of  uterus  (gramj — OOO)  .  188 
If  Datft  raDort  moicates  cancer .  wnat  is  grace? 

13.  Wat  trie  oatn  result  of  any  cervical  Biopsy  ivsiliDU?  w© 

miio  dysplasia?  NO 

Mocaratt  dysplasia?  NO 

Severe  dysplasia?  NO 

Carcinoma  in'gUu?  NO 

Invasiva  carcinoma?  NO 

14.  was  oat  lent  returnee  to  operating  room  wunm  10  oaya  of  < 
oroceoura  during  stay?  mo  i 

15.  Does  cnari  or  oDerat 1 ve  report  indicate  mora  tnan  uterus.  * 
ovaries,  ano  tupes  were  removed?  NO 

16.  was  time  from  fuii  anesthetic  induction  to  leaving  OR: 

<  3  nou-s?                                                             ,  YES 

>"  3  nours?  NO 

17.  was  pregnancy  test  performed  Burma  a  omission?  NO 

was  pregnancy  test  positive?  NO 

IB.  Enter  aomission  nemoglooin  (witnm  24  nours  DMor  to  or 

after  aomission).  13 

19.  010  Patient  receive  2  or  nor®  units  of  OlooO  ouMng  or 

following  tne  procedure?  NO 


[Figures 3, 4, and 5: screening reports for cholecystectomy, not legible in this copy. As described in the text, Figure 3 covers all military hospitals broken out by service, Figure 4 covers one service broken out by MTF, and Figure 5 details one institution by physician.]


PROCESS  MEASURES 

The  development  of  explicit  criteria  for  quality  of 
care  among  elderly  persons:  the  National  Medicare 
Competition  Evaluation 

*  The  most  important  issues  in  deciding 
between  generic  and  tracer  methodologies 
revolve  around  the  beneficiaries.  A 
thorough  study  of  quality  of  care  for  the 
elderly requires both: tracer methods to
study  resource-intensive  conditions,  and 
generic  methods  to  assess  overall  quality. 

*  Key  considerations  in  selecting  conditions 
for study include broad applicability,
relative  ease  of  measurement,  manageable 
sample  size,  and  incidence  rate. 

*  It  is  especially  important  to  link  these 
process  measures  to  outcomes  in  studying  the 
elderly.  Failure  to  do  so  can  yield  false 
effects  and  biased  findings. 


By  Sheldon  Retchin,  M.D. 

Although  my  presentation  will  focus  exclusively  on  the 
problems  of  the  elderly,  through  my  involvement  with  the 
National Medicare Competition Evaluation (NMCE), I believe
that  the  general  issues  raised  will  be  applicable  within  a 
broader  context.     I  plan  to  address  three  questions: 

*  What  are  the  most  important  issues  in  deciding  between 
generic (i.e., comprehensive review of overall care)
and  tracer  methodologies  to  assess  quality  of  care  for 
the  elderly? 

*  For  both  methodologies,  which  medical  conditions  are 
most  feasible  for  studying  the  quality  of  care  for  the 
elderly? 

*  How  relevant  are  conclusions  based  on  process  methods 
unless  they  include  measures  of  outcome? 

To  begin  with  some  background,  the  NMCE  is  designed  to 
assess  the  impact  of  alternative  health  plans  such  as  HMOs 
in  implementing  risk-sharing  health  care  programs  for 
Medicare  beneficiaries.  To  study  the  influence  of  the 
plans,  we  identified  several  potential  harmful  effects  on 
the elderly.

First, HMOs primarily effect reductions in cost by reducing
the rate and duration of hospitalizations.
The  elderly,  with  their  high  rates  of  disability  and 
chronic  disease,  are  particularly  vulnerable  to  reductions 
in  access  to  these  services.  Many  of  them  are  on  fixed 
incomes,  and  they  may  find  the  new  cost  constraints  in 
these  plans  so  frustrating  and  dissatisfying  that  they 
disenroll.  In  addition,  the  common  practice  of  rationing 
the  time  and  effort  of  primary  care  physicians  may  make 
continuity  of  care  difficult  to  achieve  for  the  frail 
elderly,  who  are  most  likely  to  feel  the  bad  effects  of 
discontinuous  care. 

Therefore,  the  NMCE  adopted  a  multi-dimensional  approach 
that  focused  on  an  analysis  of  care-seeking  behavior  and 
response  time  of  providers,  an  assessment  of  health  status 
outcomes,  a  comprehensive  evaluation  of  outpatient  care,  a 
review  of  the  process  of  care  for  two  tracer  conditions,  a 
description  of  quality  assurance  programs  that  were 
instituted  within  the  plans  over  the  time  of  the  study,  and 
an  evaluation  of  patient  satisfaction.  We  are  now  in  the 
fourth  and  final  year  of  the  program. 


Generic  vs.  Tracer  Methods 

In  many  respects,  the  differences  between  using  generic  and 
tracer  methodologies  pivot  around  two  perspectives:  the 
individual  perspective  and  the  group  perspective.  The 
elderly  individual  is  concerned,  among  other  things,  about 
the  effects  of  limited  access  for  catastrophic  or  unusual 
events,  and  therefore  studies  using  this  perspective  must 
include  resource-intensive  conditions  more  easily  studied 
through  tracer  methods.  The  group,  on  the  other  hand,  is 
concerned  about  the  medical  commons.  Thus  studies  from  the 
group  perspective  must  include  an  assessment  of  the  overall 
quality  of  care  that  is  best  evaluated  by  generic  methods. 

The  selection  of  items  for  a  comprehensive  or  generic 
review  should  be  based  on  medical  illnesses  that  are  highly 
prevalent  and  representative  of  the  group  overall.  Tracer 
methodology  should  be  utilized  for  illnesses  that,  while 
not  necessarily  frequent,  make  heavy  use  of  resources. 
Since  both  issues  were  particularly  germane  to  the  care  of 
the  elderly,  the  NMCE  used  both  approaches. 


For  generic  study,  we  chose  to  review  both  routine 
ambulatory  care  (evaluated  by  initial  assessment  and 
follow-up  of  unexpectedly  abnormal  findings)  and  the 
diagnosis and management of hypertension and diabetes, both
relatively prevalent chronic illnesses. For our resource-intensive
tracer conditions, we chose colo-rectal cancer
and  congestive  heart  failure. 


Selecting  Conditions  for  Study 

For  purposes  of  generic  review,  some  medical  conditions 
simply  did  not  occur  frequently  enough  to  be  included, 
especially  since  this  method  is  intended  to  reflect  overall 
quality  of  care.  The  data  in  Figure  1  come  from  the 
Resource  Data  Book  published  by  the  National  Institute  on 
Aging,  and  represent  three  population-based  studies  in  New 
Haven,  Iowa,  and  East  Boston.  To  take  one  example: 
significant  cognitive  impairment,  while  highly  relevant  to 
the  care  of  the  elderly,  is  present  in  only  5%  of  the 
population.  Moreover,  treatment  outcomes  are  not  highly 
successful,  and  so  would  not  be  expected  to  differ  between 
risk-sharing plans and fee-for-service plans.

There are also important sex differences in the
prevalence  of  certain  conditions  in  the  elderly  which  could 
make  the  data  less  representative  of  overall  quality.  For 
example,  although  osteoporosis  is  a  very  relevant  condition 
in  terms  of  its  utilization  of  resources,  males  are  much 
less  affected  than  females.  The  same  may  be  said  for 
depression.  Also  in  Figure  1  are  several  opportunities  for 
study  that  have  been  largely  ignored,  such  as  weight  loss 
and  hearing  deficits. 

Turning  to  tracer  methodology,  Figure  2  shows 
hospitalization  rates  for  different  conditions,  from  the 
National  Health  Survey  for  1983.  It  follows  from  my 
previous  remarks  that  we  should  be  very  concerned  about  the 
specificity  and  uniformity  of  tracer  conditions.  Thus  we 
selected  colo-rectal  cancer  and  congestive  heart  failure 
because  these  two  conditions  occurred  with  high  enough 
incidence  rates,  were  applicable  to  both  sexes,  utilized 
intensive  amounts  of  resources  and  efforts,  and  were 
specific  enough  diagnoses  to  be  amenable  to  the  development 
of  explicit  process  of  care  criteria.  We  could  eliminate 
other  diagnoses  on  the  basis  of  low  incidence  rates,  such 
as  hip  fracture,  or  specificity,  such  as  dementia,  or  poor 
applicability,  such  as  prostate  cancer. 


Other  Feasibility  Considerations 

A  related  element  to  consider  for  feasibility  is  sample 
size.  In  this  regard  the  elderly  may  be  unique,  because 
geriatric  medicine  is  still  in  its  relative  infancy  with 
regard   to  the   development   of   certain  clinical  standards. 




Figure  3  provides  an  example.  The  status  quo  compliance 
rate  for  the  recommended  management  of  the  hypertensive 
elderly  is  at  least  as  high  as  70%.  If  we  accept  a 
relative  difference  of  15%  between  comparison  groups,  we 
would  need  almost  2,000  individuals  in  each  group  to  have  a 
90%  chance  of  detecting  an  important  difference.  On  the 
other  hand,  since  the  status  quo  for  the  other  problems 
shown  is  less  established,  the  sample  size  required  for 
similar  relative  differences  is  substantially  lower.  Thus 
the  study  of  quality  of  care  issues  for  these  particular 
disorders  may  not  only  be  highly  efficient,  but  may  also 
contribute  significantly  to  innovations  and  improvements  in 
the  quality  of  care  for  the  elderly. 
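
The sample size reasoning above can be sketched with the standard normal-approximation formula for comparing two independent proportions. This is only an illustrative sketch: the transcript does not state the significance level or the test used, and Figure 3 is not reproduced here, so the function name, the two-sided 0.05 significance level, and the example rates below are assumptions. The sketch is not expected to reproduce the figure of almost 2,000 quoted above. The inputs are the two compliance rates themselves, so a relative difference must first be converted into a second rate.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.90):
    """Approximate number of subjects needed in each group to detect a
    difference between two proportions (two-sided z-test, unpooled variance).

    p1, p2 : compliance rates in the two comparison groups (0 < p < 1)
    alpha  : two-sided significance level (assumed; not given in the text)
    power  : chance of detecting the difference if it is real
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("the two rates must differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    baseline = 0.70  # status quo compliance rate mentioned in the text
    for comparison in (0.65, 0.60, 0.55):  # hypothetical comparison rates
        n = sample_size_per_group(baseline, comparison)
        print(f"{baseline:.2f} vs {comparison:.2f}: about {n} per group")
```

The required sample grows quickly as the difference to be detected shrinks, which is why the choice of condition and the size of the difference considered important dominate the feasibility of a study.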

Since  the  marginal  cost  for  data  collection  for  both  tracer 
and  generic  methodologies  can  differ  widely,  efficiency  and 
sample  size  are  important  fiscal  issues  for  determining  the 
project's  feasibility.  The  data  in  Figure  4  are  taken  from 
the  NMCE,  which  has  used  nurse  abstractors  exclusively.  In 
this  chart,  BC  stands  for  basic  (i.e.,  generic)  care,  and 
RI  stands  for  resource-intensive  care.  As  you  can  see, 
there  are  wide  differences:  more  than  a  four-fold 
difference  in  the  cost  per  case  of  locating  patient 
records,  and  a  two-fold  difference  in  record  abstraction 
costs.  Moreover,  these  figures  do  not  include  fixed  costs 
of  abstractor  supervision  and  the  like. 


Is  Process  Valid  without  Outcome? 

The  final  question  to  consider  is:  How  relevant  are 
conclusions  based  on  process  methods,  unless  they  include 
measures  of  outcome?  This  is  a  particularly  relevant 
question  for  the  elderly  for  several  reasons,  all  of  which 
may  lead  investigators  to  conclude  from  their  findings  that 
the  process  of  care  for  the  elderly  is  worse  even  in  cases 
where  it  is,  in  fact,  better. 

First,  since  the  prevalence  of  disability  increases  if  the 
elderly  live  longer,  non-linkage  of  process  with  outcome 
may  lead  to  spurious  assumptions  due  to  a  "survival 
effect".  Second,  since  functional  disability  is  more 
likely  to  be  detected  and  documented  when  medical  care  is 
better,  non-linkage  may  lead  to  "discovery  effect". 
Finally,  if  the  sampling  frame  for  tracer  conditions  comes 
from  inpatient  records  alone,  variation  in  ambulatory 
detection  rates  may  be  missed  and  outcomes  may  be  biased. 
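
The "discovery effect" can be made concrete with a small arithmetic sketch. The prevalence and detection rates below are hypothetical values chosen only for illustration; they do not come from the NMCE data, and the function name is likewise an assumption.

```python
def documented_prevalence(true_prevalence, detection_rate):
    """Share of patients whose disability is recorded in the chart.

    Both arguments are proportions between 0 and 1.
    """
    if not 0 <= true_prevalence <= 1 or not 0 <= detection_rate <= 1:
        raise ValueError("arguments must be proportions between 0 and 1")
    return true_prevalence * detection_rate


# Hypothetical values: the same true prevalence of disability in both
# plans, but the plan delivering better care detects more of it.
TRUE_PREVALENCE = 0.20
plan_with_better_detection = documented_prevalence(TRUE_PREVALENCE, 0.90)
plan_with_worse_detection = documented_prevalence(TRUE_PREVALENCE, 0.50)

print(f"better detection: {plan_with_better_detection:.0%} documented")
print(f"worse detection:  {plan_with_worse_detection:.0%} documented")
```

Under these assumed numbers, the plan with better detection shows the higher documented rate of disability even though the true rate is identical, so a review of records that is not linked to outcomes could wrongly rank it as the worse plan.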

In  choosing  specific  outcomes,  the  precision  of  acute 
physiologic  data  and  the  like  is  certainly  seductive,  but 
these  systems  may  not  be  applicable  to  the  vast  majority  of 
beneficiaries. The elderly are a heterogeneous population,
and the vast majority of them are functional. You would
therefore  need  an  instrument  that  can  pick  up  very  low 
rates  of  dysfunction.  You  would  also  need  an  instrument 
that  is  valid  across  a  broad  spectrum  of  beneficiaries — 
from  the  robustly  healthy  to  the  very  frail.  There  is 
difficulty  in  designing  instruments  with  that  type  of 
precision  and  scope,  and  I  do  not  think  we  should  be 
seduced  into  substituting  certain  types  of  data  that  on  the 
surface  are  more  precise,  but  would  not  describe  the  great 
majority  of  the  population. 


Summary 

The  most  important  issues  for  consideration  in  deciding 
between  generic  and  tracer  methodologies  revolve  around  the 
beneficiaries. For groups like the elderly, resource-intensive
conditions must be studied through tracer
methods.  When  selecting  referent  conditions  for  study,  the 
feasibility  of  study  will  depend  on  sample  size.  Sample 
size,  in  turn,  is  greatly  affected  by  incidence  rates  and 
the  status  quo  compliance  with  established  or  proposed 
standards.

Finally,  the  linkage  of  process  with  outcomes,  while  always 
hypothetically important for quality of care, may be
particularly relevant to the elderly. While this linkage
can  be  difficult  and  expensive,  conclusions  about  quality 
of  care  must  be  based  on  sound  and  rigorous  evidence  for 
this  most  vulnerable  segment  of  our  society. 


*        *  * 

Dr.  Sheldon  Retchin,  a  board-certified  internist,  is 
Associate  Professor  of  Medicine  and  Chairman  of  the 
Division  of  Geriatric  Medicine  in  the  Department  of 
Internal  Medicine  at  the  Medical  College  of  Virginia.  As  a 
Co-Principal  Investigator  for  the  evaluation  of  quality  of 
care  for  the  National  Medicare  Competition  Evaluation,  Dr. 
Retchin  has  been  responsible  for  the  evaluation  design  for 
the  project  and  will  be  the  principal  author  for  the  final 
report to the Health Care Financing Administration.

FIGURE 1. PREVALENCE OF CONDITIONS APPLICABLE TO STUDY OF QUALITY OF CARE

FIGURE 2. HOSPITALIZATION RATES FOR CONDITIONS AMONG ELDERLY

FIGURE 4. MARGINAL COSTS OF DATA COLLECTION FOR DIFFERENT QUALITY OF CARE STRATEGIES


The charge for this conference as a whole is to examine
our  current  state  of  knowledge  with  respect  to  modeling  and 
measuring  quality  of  care  in  providing  health  care  services. 
My  task  -  at  least  by  comparison  -  is  simpler:  to  examine 
the  current  state  of  knowledge  about  the  major 
organizational  determinants  of  quality  of  care.  But  I  can't 
resist  the  opportunity  to  say  a  few  words  first  about  the 
importance of the answer to the question, "What is quality
of  care?"  before  anyone  can  begin  to  address  the  "easier" 
question,  "What  organizational  factors  influence  the  quality 
of  care  provided?"   Thus,  I  will  first  provide  an  overview 
of  some  issues  in  quality  of  care  measures  as  they  apply  to 
hospitals in particular; then I will discuss some of the
organizational  factors  which  affect  how  a  hospital 
administrator  might  view  quality  of  care  and  pressures  to  be 
accountable; and then I will review some of the literature
on  organizational  factors  which  are  related  to  quality  of 
care  in  hospitals. 

In  this  discussion,  I  will  focus  mostly  on  the 
nonfederal  acute  care  hospital,  for  two  reasons.    One  reason 
is  time  limitations  and,  given  a  need  to  make  choices  about 
what  to  include,  I  prefer  to  cover  a  portion  of  the  subject 
in  greater  depth  and  sacrifice  its  full  breadth.    The  second 
is  a  related  but  more  substantively  important  rationale: 
primary care (ambulatory care), federally run institutions,
and  long-term  or  mental  institutions  have  some  important 
differences  from  acute  care  hospitals  with  respect  to  their 
goals,  tasks,  services,  and  organizational  contexts  as  well 
as  to  their  patients,  physicians,  and  third  party  payers; 
these  differences  can  lead  to  different  organizational 
processes  affecting  performance,  as  measured  by  the  costs, 
the  satisfaction  of  clients,  and  the  quality  of  care.  For 
example,  the  importance  of  the  skill  level  and  intensity  of 
interactions  with  nurses  may  have  very  different 
implications  for  cost-effectiveness  and  satisfaction  in 
these  different  settings.  Thus,  in  order  to  understand  the 
relationship  of  organizational  features  to  quality  of  care, 
we  need  to  examine  each  of  these  types  separately.    But  a 
full  picture  also  needs  to  include  all  of  these  types  and 
should  examine  their  interrelationships  as  well.  Likewise, 
we  should  examine  the  impact  on  economic  performance  and  its 
relationship  to  quality  of  care.     This  full  picture  is 
probably  beyond  the  scope  of  this  conference,  let  alone  my 
brief  overview,  and  so  I  will  settle  on  a  focus  on  quality 
of  care  in  hospitals,  with  only  a  limited  discussion  of 
costs. 

SPECIAL  CONSIDERATIONS  IN  MEASURING 
QUALITY  OF  CARE  IN  HOSPITALS 
Quality  of  care  can  be  described  as  a  portmanteau 
concept:  it  carries  a  great  many  things,  keeps  them  in  no 
particular  order,  and  does  so  in  a  way  that  conceals  them 
from view. There is still much work to be done in
conceptualizing and measuring quality of care and devising
strategies  to  ensure  it,  especially  within  a  context  of  cost 
containment.    Yet  we  have  established  several  widely  agreed 
upon  generalizations: 

First,  while  there  are  important  causal  links  between 
some  structures  and  processes  and  outcomes  in  care,  these 
links  are  complex  and  far  from  deterministic  so  they  cannot 
be  considered  substitutes  for  each  other.  This 
generalization  leads  to  the  conclusion  that  quality 
assurance  based  on  structural  or  process  measures  alone  may 
not  be  sufficient  to  assure  that  patients  are  genuinely 
better  off  after  receiving  care. 

Second,  some  processes  lead  to  greater  efficiencies 
without affecting the quality of care; others imply
trade-offs.    This  has  led  to  an  examination  of  the 
non-patient  care  aspects  of  management  apart  from  clinical 
care.  It  has  also  inspired  the  distinction  between 
production  (or  management)  and  clinical  (or  professional) 
efficiencies  and  an  examination  of  their  relationship  to 
quality  of  care  and  their  implications  for  accountability. 

Third,  technical  care  and  humane  care,  though  somewhat 
interrelated,  are  affected  by  different  structures  and 
different  processes.    This  points  to  the  need  to  examine 
more  carefully  the  full  model  of  which  organizational 
structures  and  which  types  of  organizational  processes  lead 
to  more  humane  care  and  to  greater  patient  satisfaction  on 
the  one  hand  and  which  structures  and  processes  lead  to  more 
efficient  or  to  more  effective  technical  quality  on  the 
other. 

In  order  to  measure  and  explain  quality  of  care  in 
hospitals,  we  must  deal  with  the  multiple  dimensions  implied 
by  the  term;  we  must  decide  at  what  level  we  are  trying  to 
examine quality of care (e.g., the physician, the intensive
care unit, the hospital, the community, the state, or the
nation); and we must choose the constituencies whose special
interests will be considered (i.e., to whom is the
organization and/or professional accountable?). Indeed, the
choice  of  what  is  quality  of  care  and  how  it  will  be 
measured  is  one  of  the  most  important  choices  for  the 
would-be  evaluator  for  determining  which  organizational 
factors  will  be  relevant  and  why. 

AN ADMINISTRATOR-EYE VIEW OF
QUALITY  OF  CARE 

Presumably,  one  of  the  major  reasons  behind  the  call 
for  this  conference  is  to  figure  out  how  to  hold  providers 
accountable  for  the  quality  of  care  provided.    As  a  way  of 
illustrating  the  importance  of  addressing  the  question  of 
who  is  evaluating  the  quality  of  care  in  hospitals  before  we 
can  begin  to  build  a  model  of  how  and  which  organizational 
processes  might  affect  it,    I  am  going  to  take  the 
perspective  of  a  hypothetical  administrator  trying  to  figure 
out what tasks are included in the judgment of poor or good
quality, who are the characters who are going to be judging
the  quality  of  care,  and  what  are  some  of  the  organizational 
processes  and  "rewards"  that  might  be  affected.  Our 
harassed  administrator  might  draw  up  a  list  which  revolves 
around  four  major  insights  about  hospitals  and  quality  of 
care: 

(1)  The  primary  technical  task  of  the  hospital  involves 
processing  people  as  the  "raw  material"  and  product  of  the 
work.    While  organizations  such  as  universities  or  welfare 
departments  also  provide  services  to  people  and,  in  that 
sense,  are  people-processing  organizations  too,  health  care 
involves  more  than  just  providing  services  to  people. 
People  in  this  organization  become  the  "raw  material"  in  a 
more  literal  sense.    That  is,  it  also  involves  -  at  least  in 
part  -  the  application  of  the  medical  model,  which  treats 
the  body  as  a  biological  system  to  be  examined  for 
pathological  problems  and  to  be  treated  to  alleviate  or 
eliminate  such  problems.    This  body,  which  may  be  subjected 
to  invasive  inspections  or  procedures,  is  of  course  also 
connected  to  a  person  who  must  agree  to  submit  to  such 
activities  and  who  holds  values  and  expectations  regarding 
his  role  as  "raw  material"  and  his  judgments  on  the  success 
of  the  "product"  and  the  adequacy  of  the  clinical  services 
provided.  In  short,  in  the  hospital  we  have  raw  material 
which  can  talk  back  and  which  has  a  strong  vested  interest 
in  the  process  and  outcome  of  the  work  being  carried  out. 

Furthermore,  the  professionals  performing  the  work 
and  the  raw  material  are  likely  to  differ  in  their 
evaluation  of  the  processes.    This  happens  in  part  because 
the  bases  on  which  the  patient  judges  the  process  and 
outcome  are  more  limited  -  both  in  terms  of  the  amount  of 
information  available  and  the  expertise  by  which  to  judge 
the  technical  care.    Thus  the  patient  may  choose  to  focus  on 
aspects  of  technical  care  which  are  relatively  easy  for  him 
to judge, e.g., the interaction with the professionals or
the  intensity  and  type  of  services  received. 

In  addition  to  participating  in  the  clinical  or 
technical  services  as  both  raw  material  and  evaluator,  the 
patient also judges the organization on the non-technical,
"hotel"  services  provided  such  as  the  food  and  the 
facilities,  the  courtesy  of  the  staff,  the  amenities,  the 
extent  of  privacy  and  so  on. 

Researchers  and  providers  are  prone  to  consider  the 
technical  care  more  important  under  all  circumstances.  But 
for  the  administrator  also  concerned  with  satisfied 
"customers",  high  quality  technical  care  is  certainly  no 
guarantee  for  producing  satisfied  patients.  Recent  work, 
focused  primarily  on  outpatient  care,  suggests  that  the 
quality  of  non-technical  services  as  well  as  costs  (in  terms 
of  time  and  money)    will  loom  relatively  larger  in 
importance  for  patients,  the  more  that  they  take  the  quality 
of  the  technical  services  for  granted.    For  example, 
patients may use costs or convenience as a basis for
choosing between hospitals, assuming that quality of
technical  care  -  or  at  least  their  final  outcome  -  will  be 
essentially  the  same  in  any  of  the  hospitals  that  they  are 
choosing  among. 

In  addition  to  providing  specific  services  to  the 
individual  patient  and  his  family,  hospitals  are 
increasingly  involved  in  appealing  to  organized  groups 
representing  patients.  In  the  past  these  groups  have 
primarily  been  fee-for-service  third  party  payers,  but  new 
types  of  prepaid  or  discounted  arrangements  (such  as  PPOs 
and  HMOs)  are  becoming  increasingly  important.    We  will 
return  to  this  group  in  a  few  minutes. 

(2)  Some  of  the  work  is  performed  by  professionals. 
Part  and  parcel  with  the  notion  that  professionals  are 
performing  the  work  is  the  assumption  that  the  work  is 
relatively  more  difficult  and  complex  and  requires  special 
training  to  perform.    But  not  all  of  the  tasks  performed  for 
patients  require  special  expertise,  and  sometimes  the 
difficulty  of  a  task  is  due  to  the  complexity  of  the  number 
of  employees  or  subunits  involved  to  accomplish  it  or  to  the 
need  for  the  employees  to  be  able  to  provide  services  to  a 
large  number  of  patients  with  differing  needs.  In 
constructing  mechanisms  to  coordinate  and  control  this 
complex  set  of  services  for  the  patients,  the  administrator 
must  also  consider  the  needs  and  expectations  of  the 
employees  and  the  professionals  in  particular.  Sometimes 
these  organizational  adaptations  are  seen  as  concessions  to 
prima  donnas.    That  is,  similar  to  our  views  of  the  artist 
or  dancer,  while  we  value  highly  the  talents  and  work  of  the 
professional  in  our  society  on  the  one  hand,  we  also 
secretly  harbor  a  strong  belief  that  such  people  can  be  a 
real  pain  in  the  neck,  especially  in  an  organizational 
setting   -  always  questioning  rules  and  making  egocentric 
demands  with  little  appreciation  of  the  bigger  picture  of 
the  hospital's  needs  and  obligations.    Others  have  argued 
that  some  organizational  arrangements  which  are  designed 
around  professional  needs  -  for  example,  for  evaluation  by 
peer  review,  autonomy  in  clinical  decisions,  and  being 
"captain  of  the  team"  when  other  workers  are  involved  -  are 
important  concessions  to  the  legal  and  professional 
responsibilities  implied  in  the  work.     (See  Scott  1965  and 
Hall  1967  for  a  discussion  of  three  basic  organizational 
forms  for  professionals  to  become  embedded  as  employees  in 
organizations:  as  autonomous  and  heteronomous  professional 
groups  and  as  professionals  in  departments;  in  a  typical 
hospital  all  three  types  occur:  the  medical  staff 
organization  is  an  autonomous  group  with  respect  to  the 
administration;  the  nursing  staff  organization  is 
heteronomous ;  and  ancillary  departments  such  as  the 
laboratories  involve  professionals  in  departments.  More 
recently,  Shortell  (1983)  has  argued  that  medical  staff 
organizations  "share"  authority  with  hospital  administration 
rather than represent a second and "autonomous" group within
the hospital.)

For  the  administrator  trying  to  attract  and  retain 
the physician as an "employee", it is perhaps even more
important  to  recognize  that  physicians  providing  direct 
patient  care  in  hospitals  are  atypical  employees  because  the 
hospital  probably  does  not  pay  them  for  their  work; 
moreover,  many  of  these  "employees"  are  likely  to  be  working 
part-time concurrently in a competing hospital across town.
Health  care  organizations  adapt  to  these  unusual  staff 
members  with  incentives  that  cannot  be  directly  reflected  in 
salary  or  bonuses.  Instead  they  "reward"  physicians 
primarily  on  the  basis  of  services  provided  to  help  the 
physician  provide  care  to  the  patient  or  on  the  basis  of 
opportunities  which  are  professionally  or  economically 
rewarding.    Many  of  these  rewards  have  implications  for 
quality  of  care  -  some  to  increase  it  and  others  to  decrease 
it.    For  example,  nursing  services  can  vary  by  the  extent  to 
which  they  ease  a  physician's  workload  by  permitting 
standing orders for a given type of patient or by providing
more patient education services; a resident may perform
time-consuming history and physical work-ups of the patient
or may write orders and carry out most of the physician's
responsibilities for the patient; speed of laboratory
results  or  efficiency  in  billing  and  medical  record  services 
may  also  aid  the  physician.    Examples  of  professionally 
rewarding  opportunities  in  hospitals  include  providing 
opportunities for continuing education, teaching, research,
and  association  with  excellent  and  demanding  colleagues  or  a 
prestigious  organization.    Hospitals  can  also  offer  a  group 
of  colleagues  who  share  a  style  of  practicing  medicine  -  a 
sort  of  corporate  culture  -  which  is  compatible  with  the 
physician's own preferences or can offer - for those who
prefer it - minimal peer review, little or no formal
credential review, and a generally more relaxed,
"hands off" attitude towards its medical staff.

Physicians  become  members  of  the  hospital  via 
receiving  "privileges"  to  treat  patients  as  members  of  the 
medical  staff.    This  relationship  is  technically  between  the 
hospital  and  the  individual  physician.    But,  as  physicians 
are  increasingly  turning  to  corporate  arrangements  such  as 
group practice, HMOs, and PPOs, the hospital must seek to
attract  and  collaborate  with  the  corporate  body  of 
professionals.  Indeed,  some  of  these  arrangements  preclude 
or  at  least  severely  restrict  physicians  from  practicing  in 
several  competing  hospitals,  forcing  them  to  choose  a  single 
hospital, for at least some types of patients. Other
arrangements  may  differ  in  how  much  the  physician  faces 
personal  financial  risk  or  benefit  from  hospitalization, 
length  of  stay  in  the  hospital,  or  the  services  provided, 
and  whether  these  incentives  work  in  concert  with  or  in 
opposition  to  the  incentives  for  the  organization.  For 
example, physicians in our local group model HMO are paid on
a productivity formula, which means that they personally get
rewarded on a per service basis, although the HMO is paid on
a per capita basis.

Thus, hospitals typically must not only provide
services  to  patients  and  to  physicians,  but  they  must  cater 
to  organized  groups  of  these  constituencies.    These  services 
-  technical  and  hotel  and  support  for  the  physician  -  are 
also  subject  to  meeting  the  expectations  of  the  physicians 
for  satisfactory  performance  and  may  affect  the  ability  of 
the  organization  to  attract  and  retain  high  quality  staff. 
Yet  services  provided  for  the  benefit  of  professionals  are 
not  purchased  by  the  physician  from  the  hospital.  In  this 
sense physicians resemble employees of the organization, but
employees who perform the work in return for services instead
of a salary.

This  discussion  is  limited  to  a  focus  on  direct 
care  physicians  as  professionals  in  the  hospital.  But  it 
should  be  recognized  that  there  are  other  professional 
groups,  such  as  nurses,  who  will  also  make  demands  on  the 
organization  based  on  professional  responsibilities  (such  as 
delivering  quality  care)  and  which  involve  a  different  set 
of  professional  organizations  including  professional  unions. 

(3)  The  services  hospitals  provide  are  paid  by  yet 
another  group.    Not  only  do  physicians  not  pay  for  services 
hospitals  provide  to  them;  much  of  the  services  provided  to 
patients are paid by third party payers (i.e., insurance and
government through grants or fee-for-service insurance).
This  provides  yet  another  constituency  for  our  hospital 
administrator  to  consider  in  meeting  standards,  in  defining 
and  limiting  services,  and  in  constructing  charges  to  cover 
costs  for  the  complex  set  of  patient  and  physician  health 
care services and other services such as administration,
hotel, and support. The nature of this relationship to
third party payers - fee-for-service, prepaid health
maintenance arrangements, prospective lump-sum payments for
a given episode, preferred provider, or some complex mixture
of these - influences the incentives for providing services
to  patients.    That  is,  providing  more  services  in  the 
hospital  may  or  may  not  mean  an  opportunity  for  the  hospital 
to receive greater reimbursement. For example, in PPOs and
HMOs and other arrangements for lump-sum payments, the
incentives  for  the  organizations  at  financial  risk  are  to 
attract  more  patients  or  a  larger  enrolled  population 
without  incurring  high  costs  from  treating  them.    On  the 
other  hand,  as  many  have  observed,  fee-for-service  or 
"cost-plus"  arrangements  tend  to  encourage  a  production 
orientation, not only to obtain more patients in the
hospital,  but  also  to  provide  more,  and  more  expensive, 
services  than  perhaps  necessary. 

(4) Because of the widely held views regarding the right
to  receive  at  least  minimally  adequate  health  care  and 
because  of  the  legal  and  professional  bases  in  medicine, 
hospitals  and  other  health  care  organizations  are  the 
subject of unusual scrutiny and accountability to external
groups,  in  addition  to  those  with  direct  fiscal  involvement 
in  the  production  process.    Thus,  while  all  organizations 
have  these  problems  to  some  extent,  our  hospital 
administrator  -  concerned  with  how  the  quality  of  the  work 
will  be  defined  and  evaluated  -  must  take  into  account  a 
large  number  of  accrediting  and  licensing  groups  which  are 
external  to  the  hospital  and  independent  of  the  fiscal 
groups  discussed  above. 

These  features  tend  to  make  hospitals  somewhat  unusual 
in  comparison  to  complex  organizations  in  other 
"industries."     It  can  be  misleading  to  overemphasize  their 
dissimilarity,  however.    We  should  also  remember  that  they 
share  the  need  to  gain  enough  resources  over  expenditures 
(whether  or  not  they  are  for-profit)  to  survive;  a  need  for 
effective  control  over  the  quality  and  efficiency  of  work 
processes;  effective  coordination  of  interdependent 
activities;  a  means  to  attract  and  retain  appropriate  staff; 
mechanisms  to  get  decisions  made;  and  a  means  to  effectively 
interact  with  organizations  and  constituencies  outside  of 
the  hospital  per  se  and  to  compete  with  similar 
organizations.  They  are  also  subject  to  many  societal  level 
influences  such  as  the  general  trend  toward  increasing 
accountability  for  organizations  of  all  types  and  the 
increasing  disillusionment  and  distrust  with  professionals 
and  high  status  positions  more  generally.    It  is  to  these 
features  -  and  evidence  that  they  are  important  for  quality 
of  care  in  hospitals  -   that  we  will  return  below. 

ORGANIZATION AND QUALITY OF CARE

First we need to ask a broader question: What evidence
is  there  that  organizational  determinants  are  important  for 
quality  of  care  at  all  and,  if  so,  just  how  important  are 
they  anyway?    In  the  first  part  of  this  century,  quality  of 
care  was  essentially  viewed  as  the  responsibility  of  each 
individual  professional  who,  through  a  long  socialization 
and  internship  process,  gained  the  skills  and  the  values 
which  ensured  high  quality  of  care.    Thus  quality  assurance 
efforts  were  aimed  at  ensuring  that  medical  students 
mastered the basic material and that established physicians
demonstrated their credentials through licensing and
certification  processes.   Most  assumed  that  hospitals  played 
a  very  limited  role  in  quality  of  care,  so  that  it  was 
necessary  only  to  certify  that  good  "workshop"  conditions 
were  met,  such  as  adequate  safety,  sanitation,  staff  and 
facilities.  The  last  fifteen  years  have  witnessed  a 
resurgence in recognizing the importance of organizations
for  affecting  the  quality  of  care.     Organizations  can  place 
limitations  and  impediments  on  performance;  they  can  monitor 
and  create  incentives  to  produce  better  quality  of  care.  In 
fact,  most  of  this  evidence  suggests  that  the  most  important 
predictor  of  quality  of  care  is  the  organization  where  care 
is  provided,  not  the  qualifications  and  experience  of  the 
physician  performing  the  work  (Flood  and  Scott,  1987  [Chap. 
8]; Rhee 1977; Rhee, Luke, and Culverwell 1980). Freidson
(1970)  has  long  argued  that  the  professional  organizations 
are  much  more  important  than  the  professionalization  process 
in  determining  quality  of  care.    But  it  took  the  concern 
over  containing  costs  and  the  relative  ease  of  attacking  the 
hospitals  instead  of  physicians  for  us  to  begin  to  fully 
recognize  and  examine  the  role  of  organizations  in  affecting 
quality  of  care. 

This  makes  today's  task  somewhat  easier  because  there 
are  relatively  few  studies  of  organizational  determinants  of 
quality  of  care.    And  even  these  all  too  often  focus  on 
identifying  structural  features  which  are  associated  with 
quality  of  care  rather  than  ferreting  out  the  processes 
whereby  structure  can  affect  quality  of  care. 

ORGANIZATIONAL  FACTORS 

What  are  the  factors  which  explain  why  organization  is 
important? Those advanced are primarily structural or, at
best,  indirect  indicators  of  clinical  process:  major 
differences  in  mission:  type  of  ownership,  profit  or 
non-profit  and  teaching  status;  case  mix  and  volume  of 
similar  cases;  staff  qualifications;  peer  review  and  the 
medical  staff  organization;  management  practices;  and 
memberships  in  multihospital  systems.    (See  Flood  and  Scott, 
1987  [Chap  3]  for  a  more  detailed  review  of  some  of  these 
articles.) 

Hospital  ownership  and  profit  and  non-profit  status. 
With  the  growth  of  for-profit  hospitals,  the  concern  for  the 
plight  of  the  public  patient,  and  the  increased  threat  of 
hospital  closures  across  the  country,  there  is  a  renewed 
interest  in  studying  the  quality  of  care  in  hospitals  with 
different  ownerships  and  profit  status.    The  public 
hospital,  arising  from  the  old  charity  hospitals,  has  been 
studied  more  closely  for  its  efficiency  than  quality  of 
care.    There  is  some  evidence  to  suggest  that  quality  of 
care  is  lower  in  these  hospitals.    The  debate  about 
ownership  has  focused  more  fiercely  around  the  issue  of 
profit. Marmor and his colleagues have asserted that the
ideological debate over the importance of a profit or
non-profit organizational form has confused profit status
with a rise in commercialism and a decline in
professionalism in medicine, when in fact these trends are
much more general and essentially independent of profit
status (Marmor et al., 1986). Recent studies have found
that  there  is  little  or  no  difference  in  quality  of  care  in 
profit  and  non-profit  hospitals.    One  study  of  mental  health 
hospitals  would  suggest  that  this  holds  only  when  physicians 
take  an  active  role  in  delivering  care  in  hospitals 
(Schlesinger and Dorward, 1984). Some have argued that since
for-profits  operate  as  well  as  non-profits  and  without  as 
much  public  subsidy,  then  they  are  more  efficient  and 
therefore  to  be  preferred  (Becker  and  Sloan,  1985; 
Herzlinger and Krasker, 1987). Others would suggest more
caution in this generalization, having found that
for-profits tend to behave differently in competitive
markets (e.g., by providing less care to the poor [Shortell
et al., 1986] and by avoiding teaching and research
activities [Ermann and Gabel, 1984]).

Teaching  Status.    There  have  been  numerous  studies  this 
past  decade  linking  teaching  status  of  hospitals  to  costs, 
but  fewer  studies  investigating  the  relationship  between 
teaching  and  quality  of  care.    There  are  two  important 
complications  to  examining  this  relationship.    The  first  is 
the  necessity  for  taking  into  account  the  special  nature  of 
the patient population before comparing teaching and
nonteaching hospitals. Most research and reimbursement
mechanisms  are  based  on  the  assumption  that  teaching 
hospitals  tend  to  treat  the  most  difficult  diseases  and  the 
most  advanced  and  complex  problems.    To  the  extent  that  this 
is  true,  measures  based  on  outcomes  or  the  level  of  services 
provided  may  be  biased  toward  finding  poor  quality  of  care 
in  teaching  hospitals,  unless  they  are  adequately  adjusted 
for case mix (cf. Horn et al. 1985). The second is the
necessity  of  taking  into  account  the  special  missions  of  the 
teaching  hospital:  to  provide  a  workshop  for  training  new 
physicians  and  for  developing  and  researching  advanced 
diagnostic  and  treatment  protocols.  Indeed,  when  quality  is 
judged  on  the  basis  of  thoroughness  of  the  clinical  work-up 
and  the  advanced  nature  of  clinical  techniques,  such 
measures  are  likely  to  be  biased  toward  finding  the  highest 
quality of care in teaching hospitals (cf. Goss 1970).

Several  researchers  have  pointed  out  that  not  all 
teaching hospitals are alike with respect to these missions,
i.e., with respect to treating advanced patients; with
respect to their mission to provide advanced and
sophisticated care; or in terms of the extent of supervision
given to inexperienced physicians-in-training (cf. Horn et
al. 1985; Sloan et al. 1983; Flood and Scott 1987 [Chap.
12]). Overall, studies over the past thirty years have
found that teaching hospitals with major affiliation with
medical schools tend to be associated with higher costs and
better quality of outcomes and more sophisticated
techniques, even after taking patient mix into account (cf.
Kohl, 1955; Trussell et al. 1962; Lee et al. 1957; Lipworth
et al. 1963; Goss 1970). Other teaching hospitals do not
necessarily  fare  so  well  in  their  comparisons  to  either 
non-teaching  or  major  teaching  hospitals  (see  Flood  and 
Scott 1987 [Chap 12]). In the few studies focused on the
less  technical  side  of  care,  teaching  hospitals  tended  to 
have patients who were less satisfied with their care (see
Fleming 1981) and, in one study, who were less likely to be
"allowed" to die, even with little or no likelihood of
improvement (Garber et al. 1984).

Patient  Case  Mix  and  Volume.    Hospitals  can  to  some 
extent  exercise  control  over  the  mix  and  volume  of  patients, 
by controlling whether and when they are admitted and which
units  and  staff  will  provide  care  to  different  types  of 
patients.    Organizational  theory  would  suggest  that  there  is 
no  best  way  to  organize  to  achieve  quality  care;  instead  the 
appropriate  structures  and  staff  should  depend  upon  the 
difficulty  of  the  case  mix  at  the  hospital.    This  argument 
remains  largely  unexplored  although  the  little  evidence  that 
exists  does  not  support  this  hypothesis  (see  Schoonhoven 
1981). The importance of volume of cases fares much better.
There  is  strong  evidence  from  several  studies  that  the 
greater  the  volume  of  similar  cases  treated  at  the  hospital, 
the better the outcomes (cf. Luft et al. 1979; Luft 1980;
Flood et al. 1984a and 1984b; Kelly and Hellinger 1986;
Maerki  et  al.  1986;  Sloan  et  al.  1986;  Showstack  et  al. 
1987). What is less clear is the explanation: is it
because  the  organization  and  its  staff  become  more  practiced 
in  managing  and  caring  for  these  patients?    (See  Flood  et 
al.  1984c;  Flood  and  Scott  1987  [Chap  12];  and  Sharp  1987.) 
Or  is  it  because  "good"  care  builds  a  reputation  which  then 
begets  more  patients  coming  to  the  hospital?  (See  Dranove 
1984; Luft et al. 1985.)

Staff  Qualifications.  The  evidence  of  the  importance 
of  physician  qualifications  for  assuring  quality  of  care  is 
somewhat  mixed.    As  already  noted,  several  studies  have 
compared  the  importance  of  physician  qualifications  and 
organizations  in  predicting  differences  in  quality  of  care 
and  found  that  organizations  were  more  important.  Older 
studies  tend  to  support  the  importance  of  individual 
qualifications  while  recent  studies  have  found  some  evidence 
that  specialization  and  sticking  to  their  own  domain  of 
practice  tends  to  predict  better  care  in  terms  of  more 
appropriate  hospitalizations  and  uses  of  services.    Much  of 
the  literature  focused  on  nursing  staff  qualifications  does 
not  use  the  same  basis  for  measuring  quality  of  care, 
concentrating instead on nursing care tasks, or using
structural indicators of hospital quality (see Halloran and
Kiley 1987).

Medical  Staff  Organization.  Several  studies  spanning 
the  past  three  decades  have  found  evidence  that  a  strong 
medical  staff  organization  -  strong  in  terms  of  peer  review, 
selection  and  continued  review  of  new  staff  members,  and 
participation  in  policy  setting  committees  -  tends  to  be 
related  to  better  quality  of  care.    The  specifics  of  how 
these  processes  work  and  whether  the  same  factors  are 
important  for  different  groups  of  patients  (such  as  surgical 
patients versus medical patients) are less well understood.

Management  Practices.    There  are  three  broad  aspects  of 
management  which  have  been  found  to  affect  the  quality  of 
care: coordination mechanisms, efficiencies in non-medical
services,  and  power  and  control. 

Coordination.    Several  studies  have  concluded  that 
coordination  mechanisms,  especially  among  nursing  and 
ancillary  units,  affect  quality  of  care.  While  most  forms  of 
coordination are generally regarded as helping, or at least not
harming, quality of care, one form of coordination - formal
specification of procedures for job performance and
sequencing - has been argued to be harmful. Both the
professional  model  and  the  entrepreneurial  model  argue  that 
specification  of  work  is  harmful  when  the  tasks  are  complex 
such  as  for  medical  care.    While  some  studies  would  support 
these  contentions  (finding  that  restrictions  on  drugs 
available,  poor  medical  record  keeping,  and  highly  specified 
clinical procedures were all associated with poorer care),
others  have  found  no  support.  Still  others  report  contrary 
findings:  more  explicit  policies  regarding  nursing  care  were 
associated  with  better  care.    In  trying  to  explain  these 
confused  and  confusing  results,  there  is  a  general  consensus 
that  the  tests  to  date  regarding  the  importance  and  role  of 
coordination  have  been  too  simplistic.  Formalized 
coordination  is  not  necessarily  incompatible  with  the 
flexible  decision  making  needed  in  clinical  care. 

Managerial  efficiency  in  nonmedical  tasks.  Shortell, 
Becker,  and  Neuhauser  (1976)  found  evidence  that  greater 
specification  for  nonmedical  tasks   was  related  to  more 
efficient  production  and  in  turn  to  better  quality  of  care. 

Control  Mechanisms  and  the  Visibility  of  Consequences. 
Control  over  physicians  by  the  medical  staff  has  already 
been covered. Other control mechanisms tend to present a
very confused story. There appears to be no measure of
administrative influence, encroachment of power, or
involvement by physicians in administrative or policy
setting committees which leads consistently to better quality
of  care.  When  administrative  tasks  were  given  greater 
scrutiny (or were more visible to superiors), the
non-medical  aspects  of  the  hospital  appeared  to  perform 
better  but  there  was  no  evidence  of  a  direct  link  to  quality 
of  care. 

Membership  in  a  Multihospital  System.  There  is  great 
interest  in  whether  hospitals  in  systems  behave  more 
efficiently  or  effectively.    Despite  this  great  interest, 
there  are  few  studies  of  quality  of  care  in  systems,  due 
largely  to  the  great  variety  of  types  of  systems;  the 
diversity  of  hospitals  included  in  systems  (both  within  and 
across systems); and the difficulty of assessing quality of
care at a system level. Nonetheless, there is some evidence
to suggest that quality of care can be improved by system
membership. Most of this evidence is rather indirect -
improved  availability  of  patient  services  and  more  qualified 
personnel  than  otherwise  available.    There  is  also  some 
evidence  to  suggest  that  system  hospitals  are  more  likely  to 
have  increased  job  dissatisfaction  and  friction  and  less 
likely  to  have  staff  who  participate  in  teaching  or  research 
activities.  As  Pattison  and  Katz  (1983)  caution,  much  of  the 
recent  growth  and  changes  in  multihospital  systems  were 
based  on  managerial  strategies  which  were  well  suited  to  the 
reimbursement system and context of the 1970s. As a result
of changes at the state and federal level, many of these
strategies are obsolete and the implications for the future
effectiveness of multihospital systems remain to be seen.

CONCLUSIONS  AND  SUMMARY 

We  have  begun  to  open  up  the  organizational  "black  box" 
and  look  inside  the  hospital.  For  the  most  part  we  have  made 
some  inroads  on  trying  to  link  some  structural  factors  to 
quality of technical care (mostly in terms of outcomes, but
also some processes; a few studies examine patient satisfaction). Since
we  also  have  established  that  structure  or  process  cannot  be 
assumed  to  link  perfectly  with  better  outcomes,  we  need  to 
unravel  more  carefully  the  processes  linking  structure  to 
quality  of  care  outcomes.  But  we  are  still  a  long  way  from 
understanding  these  complex  linkages.    Without  understanding 
at this level, setting up incentives or imperatives may
reproduce  the  structural  characteristic  but  not  its  intended 
effect:  better  care  without  undue  costs.  The  legal  system 
and  health  policy  (especially  as  tied  to  reimbursement)  need 
to  recognize  the  choices  and  trade-offs  made  in  clinical 
decisions to provide adequate quality of care and services and
to use reasonable resources to achieve them. Cost containment
based  on  the  assumption  that  only  inefficiencies  will  be 
eliminated  is  naive.    To  the  extent  that  we  want  to  help 
achieve  a  balance  between  expenditures  and  quality  of 
services  and  outcomes,  we  need  to  study  them  concurrently 
and  to  understand  more  fully  the  organizational  factors 
which curtail or encourage both effects. The research which
purports  to  examine  both  of  these  relationships  is  scant  and 
crude  indeed. 

The  past  few  years  have  also  pointed  out  the  fallacy  of 
stopping  with  the  hospital  level  in  organizational  analysis. 
Hospitals  have  many  complex  linkages  to  other  hospitals,  to 
other  health  care  organizations,  and  to  other  businesses  and 
groups  more  generally.    These  too  represent  organizational 
factors  which  can  influence  quality  of  care  as  well  as 
costs.    Moreover,  the  growing  literature  in  small  area 
analysis  and  the  well-established  regional  differences 
suggest  that  we  need  to  look  more  carefully  outside  the 
black-box  limited  to  health  care  organizations  to  a  larger 
context  of  influences  on  the  performance  of  care  in 
hospitals. 

A  review  of  the  literature  on  organizational 
determinants  reveals  only  a  skeleton  of  information 
regarding  these  processes  and  factors,  despite  the  great 
policy importance and urgency of the issues. We need to
rectify  this  problem  and  soon. 

STRUCTURAL MEASURES

Are  these  measures  of  quality  unless  they  are 
predictors  of  patient  outcomes?  How  can  the  consumer 
of  health  care  decide  what  is  best? 


*  An  Orange  County  (California)  study  of 
quality  indicators  was  able  to  develop  sets 
of  structural  measures  that  consumers  could 
use  to  find  health  care  services  that  met 
their  needs. 

*  Structural  elements  can  also  serve  as  a 
measure  of  patient  satisfaction,  by 
identifying  hospital  programs  that  are  based 
on a management commitment to patients'
well-being (e.g., interpreters for a large
population of non-English-speaking
patients).

*  In  many  important  areas,  however,  the  use  of 
structural  measures  alone  can  give 
misleading  results.  They  must  be  coupled 
with  outcome  measures. 


By  Fred  Bodendorf,  Ph.D. 

I  have  been  asked  to  address  an  extremely  difficult 
question  relating  to  structural  elements  in  quality  of 
care.  I  must  confess  I  do  not  have  a  system  to  present  to 
demonstrate  the  use  of  structural  elements  that  are  not 
predictors  of  patient  outcomes.  What  I  would  like  to  do 
instead  is  to  describe  a  California  project  that  took  place 
while  I  was  with  the  Orange  County  Health  Planning  Council, 
in  which  we  grappled  with  questions  of  structural  elements, 
especially  those  elements  which  were  not  predictors  or  had 
not  been  looked  at  as  predictors  of  outcome.  I  will 
conclude  by  giving  my  viewpoint  on  the  pitfalls  of  using 
structural  elements  by  themselves. 


The  Orange  County  Project 

The  Orange  County  Health  Planning  Council  project  took 
place  in  1984  and  1985.  The  goal  was  to  develop  a 
comprehensive  list  of  quality  of  care  indicators.  There 
was  (and  is)  a  real  concern  that  competition  would  have  an 
adverse  impact  on  health  care,  as  purchasers  began  to  buy 
services  on  the  basis  of  cost  alone,  without  regard  to 
quality.  There  was  also  a  concern  about  Medicaid  in 
California. The state's approach to contracting with
hospitals for care for the indigent appeared to ignore
quality  of  care  issues,  and  it  was  felt  that  we  could  end 
up  with  a  two-tier  system:  higher-quality  care  for  those 
who  could  afford  it,  and  poor-quality  care  for  those  who 
could  not.  Since  the  study  was  completed,  there  have  been 
those  who  claim  that  that  is  indeed  the  case  in  California 
today. 

For  the  project,  we  formed  a  task  force  of  physicians, 
hospital representatives, insurance industry
representatives, nurses, laboratory specialists, etc., a
group  representing  the  full  continuum  of  care.  We  used  a 
consensus-building  process  in  which  we  tried  to  learn  what 
each  member  of  the  task  force  thought  quality  really  was. 
Not  surprisingly,  everyone  had  a  very  strong  viewpoint.  It 
seems  that  everyone  knows  what  quality  is — but  when  you  try 
to  put  it  down  on  paper,  it  becomes  a  little  more 
difficult. 

Our  first  task,  then,  was  to  define  quality.  Our  second 
task  was  to  identify  specific  indicators  that  could  reflect 
quality.  We  started  with  a  very  large  list  and  narrowed  it 
down,  using  the  scheme  of  structure/process/outcome  to 
group  the  indicators  into  categories  that  made  sense.  We 
then  tried  to  develop  standards  for  measuring  the 
indicators.  This  involved  going  through  the  literature  to 
learn  what  indicators  had  been  measured  in  the  past  and 
whether  they  succeeded  in  predicting  outcomes.  If  we  found 
standards  published  in  the  literature,  we  adopted  them.  If 
there  were  no  published  standards  but  we  could  obtain  good 
data  to  measure,  we  tried  to  formulate  our  own  Orange 
County  standard.  And  if  there  were  no  data  available,  we 
tried  to  develop  a  way  to  measure  the  indicator.  We  then 
collected  data  from  hospitals  to  test  our  measurements. 

We  were  not  able  to  arrive  at  any  final  results.  I  think 
one  of  the  drawbacks  to  an  approach  that  uses  so  many 
different  kinds  of  indicators  is  that  one  cannot  produce  a 
final  measure  for  a  particular  provider  or  hospital  because 
not  enough  is  known  about  which  measures  are  most 
significant.  Nonetheless,  we  felt  the  project  was  useful. 
Examining  health  care  in  a  comprehensive  manner  allows 
users  of  these  data  to  take  the  indicators  they  feel  are 
most  important  and  utilize  them.  We  developed  some 
indicators  we  felt  consumers  could  use,  some  we  felt 
purchasers  could  use,  and  so  forth. 


The Utility of Structural Measures

There  is  a  wealth  of  information  showing  relationships 
between outcomes and some structural elements of care. For
example:

*  Departmentalization  of  hospital  medical  staffs  and 
board  certification  of  physicians  are  important 
components  in  predicting  outcomes. 

*  The  structure  of  the  nursing  staff  is  extremely 
important: the proportion of R.N.s and L.P.N.s, the
number of R.N.s per patient, the number of hours per
patient day that R.N.s spend at the bedside.

*  Hospitals  with  a  wide  range  of  services  tend  to  have 
better outcomes, as do hospitals that perform a large
number  of  the  same  types  of  procedures. 

There  are,  however,  a  large  number  of  structural  elements 
where  this  relationship  is  not  evident.  Pennsylvania  has 
just  enacted  a  law  creating  a  data  collection  program  that 
is  mandated  to  measure  not  only  the  cost  of  care  but  the 
quality  of  care.  In  the  law,  provider  quality  is  defined 
as  "the  extent  to  which  a  provider  renders  care  that  within 
the  capabilities  of  modern  medicine  obtains  for  patients 
medically  acceptable  health  outcomes  and  prognoses, 
adjusted  for  patient  severity,  and  treats  patients 
compassionately  and  responsibly."  The  primary  focus  of 
this legislation is clearly on outcomes; however, there is
that  mandate  to  look  at  whether  treatment  is  compassionate. 
The  task  force  in  Orange  County  wrestled  with  the  same 
question.  We  felt  very  strongly  that  treatment  of  the 
patient  and  patient  satisfaction  were  important  quality 
indicators,  but  we  found  it  very  difficult  to  develop  a 
standard  or  measure.  Outcome  measures  have  not  been  shown, 
to  my  knowledge,  to  relate  to  patient  satisfaction. 

In  considering  patient  satisfaction,  as  well  as  other 
structural  elements  that  may  not  relate  to  outcomes,  it  can 
be helpful to look at two extremes. One extreme is the
fact that negative hospital outcomes are relatively rare;
most hospital patients are treated in a satisfactory manner
and come out better than when they went in. What do these
patients  want  from  their  hospital  stay?  At  the  other 
extreme,  we  have  some  patients  with  a  very  small  chance  of 
a  positive  outcome.  What  do  these  patients  want?  Are  they 
looking  for  the  treatment  the  facility  provides?  More  than 
likely  they  are.  In  many  cases,  therefore,  the  consumer 
may  be  looking  for  information  other  than  outcome  measures. 
What  are  the  measures  of  quality  in  those  cases? 

The  most  common  way  to  measure  patient  satisfaction  is 
through survey methods. We in Orange County took a
different approach. We looked for the existence of
programs which would indicate that the hospital's
management was concerned about patient satisfaction and
felt  the  patient  was  important.  For  example,  if  the 
hospital  treated  a  sizeable  number  of  non-English-speaking 
patients,  did  they  provide  interpreters?  Did  they  provide 
alternative  delivery  services  such  as  home  health  care? 
Did  they  provide  those  types  of  care  that  are  not 
traditionally  thought  of  as  hospital  services?  We  felt 
that  the  very  existence  of  these  programs  was  an  indicator 
of  quality,  because  it  reflected  the  hospital's  overall 
philosophy  of  care. 

I  think  this  type  of  information  is  important  for 
consumers.  The  patient  going  in  to  deliver  a  baby,  for 
example,  would  probably  be  less  concerned  with  the 
likelihood  of  death  in  the  hospital  than  with  the  type  of 
setting  they  offer.  This  is  information  the  consumer  can 
understand,  and  can  use  to  shop  for  care. 


The  Drawbacks  of  Structural  Measures 

I  have  a  concern  that  structural  elements  not  be  used  alone 
to  measure  quality  of  care.  To  take  a  major  example  at  the 
national  level,  many  researchers  want  to  study  the  impact 
of  deregulation.  As  certificate  of  need  requirements  are 
eliminated  or  curtailed,  we  will  be  adding  more  hospital 
beds  and  more  services.  A  purely  structural  analysis  would 
indicate that the more you have, the better the quality;
yet  many  people  are  now  saying  that  deregulation  will  have 
the  opposite  effect.  Obviously,  we  need  outcome  measures 
to  answer  that  question. 

We  also  need  outcome  measures  to  study  the  question  of 
early discharges, the "sicker and quicker" phenomenon. We
must  have  measures  that  show  how  sick  patients  are  when 
they  go  into  the  hospital  and  how  well  they  are  when  they 
come  out.  And  we  need  outcomes  to  judge  the  impact  of 
medical  innovations;  the  mere  fact  that  a  hospital  has  a 
new  service  or  new  technology  may  not  indicate  good  quality 
of  care. 

At  the  regional  level,  I  think  studies  dealing  with 
variations  in  utilization  of  health  services  and  variations 
in  surgical  procedures  would  not  be  as  interesting  without 
outcome  measures,  which  we  need  if  we  are  going  to  effect 
change  in  utilization.  Rural  vs.  urban  questions  also  play 
an  important  role  at  the  regional  and  state  levels. 
Structural  measures  of  care  tend  to  penalize  rural 
hospitals  in  favor  of  larger  urban  hospitals.  What  can 
happen  if  one  focuses  exclusively  on  structural  elements  is 
that  one  is  forced  to  develop  separate  standards  for  urban 
and  rural  institutions. 

To move down to the community scale, if we use structural
measures alone to guide purchasers in selecting hospitals,
obviously they will steer their patients to larger
institutions because of the relationship between size and
quality. We need outcomes to help purchasers identify the
exceptions to the general rule.

There  are  many  quality  questions  that  cannot  be  answered  by 
structural  measures  alone,  and  we  really  do  not  know  enough 
about  the  interrelationships  among  structural  measures  to 
use  them  to  predict  outcomes.  I  do  not  believe  that  many 
of  the  important  uses  of  quality  data  are  possible  without 
outcome  measures. 


*        *  * 

Dr.  Fred  Bodendorf  holds  a  Ph.D.  in  geography.  He  is 
currently  Assistant  Director  of  the  Pennsylvania  Health 
Care  Cost  Containment  Council,  whose  mandate  includes 
monitoring  quality  of  care.  Dr.  Bodendorf  was  previously 
Director  of  Research  for  the  Orange  County  Health  Planning 
Council,  where  he  developed  a  publication  titled  "A 
Hospital's I.Q.: Indicators of Quality" for consumers.


STRUCTURAL MEASURES

Basic  elements  and  issues  of  a  revised  assessment 
process:  clinical  indicators,  severity  adjustment, 
and  organizational  indicators 

*  The  Joint  Commission  on  Accreditation  of 
Hospitals  is  now  developing  a  new  system 
which  highlights  outcomes  as  indicators  of 
quality.  This  represents  a  major  change 
from  the  current  system's  focus  on  process 
and  structure. 

*  The  new  system  also  departs  from  the  current 
triennial  on-site  survey  and  moves  to 
continuous  monitoring  of  data,  which  is 
reviewed  by  JCAH  and  returned  to  the 
hospital  for  use  in  its  own  peer  review 
process. 

*  The  system  has  five  building  blocks: 
clinical  indicators,  organizational 
indicators,  severity  adjustments,  data 
systems, and a results-oriented survey
process.


By  James  Prevost,  M.D. 

The  Joint  Commission  on  Accreditation  of  Hospitals  has 
launched  a  major  developmental  process  that  will,  given  our 
objectives,  change  the  way  we  do  business.  I  would  like  to 
tell  you  about  that  today. 

It  would  be  a  major  understatement  to  say  that  the  health 
care  system  in  this  country  is  undergoing  major 
reconfiguration.  Driven  by  a  number  of  imperatives, 
chiefly  the  need  for  cost  constraint,  there  have  been  many 
changes:  segmentation  of  the  marketplace,  vertical  and 
horizontal  integration,  competition  among  providers, 
alternative  delivery  system  development.  It  is  important 
to  note  that  these  changes  have  at  least  the  potential  to 
affect  quality  of  care. 

Where  there  was  once  an  assumption  of  quality,  consumers 
and  government  and  payors  are  now  asking  about  the  quality 
of  services  provided  given  their  cost.  What  is  their 
value? It is our view that one cannot look at the question
of  value  by  merely  making  an  equation  to  efficiency.  One 
must  also  consider  quality. 

It is not sufficient, however, to answer the question of
whether  an  institution  can  provide  quality  of  care.  This 
is  merely  a  matter  of  capacity,  perhaps  of  structure  and 
process,  perhaps  of  compliance  with  current  JCAH  standards. 
Today  there  is  a  different  question:  does  an  institution 
provide  quality  of  care?  The  major  stimulus  for  the 
changes  the  Joint  Commission  is  about  to  make  is  our 
attempt  to  help  an  institution  answer  that  question. 


The  Agenda  for  Change 

Our  board  debated  this  issue  in  the  fall  of  1985,  and  by 
last  fall  gave  the  go-ahead  for  what  we  see  as  a  major 
change in our system. We want not just a change, but the
right system: a valid one, one that can continue to meet
the needs of a rapidly changing industry, one that
providers of care can accept. We call this our Agenda for
Change.

The  Agenda  for  Change  has  two  major  themes.  The  first  is  a 
movement  away  from  our  traditional  emphasis  on  structure 
and  process  standards,  to  a  greater  emphasis  on  outcome 
through  the  use  of  indicators  of  clinical  care  which  will 
allow  an  analysis  of  clinical  activities,  plus  indicators 
that  will  allow  us  to  analyze  the  organizational 
characteristics  and  managerial  activities  that  appear  to 
have  an  impact  on  quality.  We  mean  to  use  these  indicators 
as  screens,  since  we  do  not  believe  that  one  can  determine 
quality  from  external  review  and  measurement  of  indicators; 
that  judgment  must  come  from  local  peer  review.  I  would 
like  to  add  that  we  are  not  walking  away  from  structure  and 
process.    We  feel  they  are  as  important  as  ever. 

The  second  major  theme  is  to  move  away  from  our  rather 
standardized  three-year  on-site  survey  process  to  a  more 
continuous  monitoring  system  involving  the  indicators  I 
have  just  described.  We  hope  to  receive  that  information 
from  the  institutions  we  accredit,  review  it,  and  feed  it 
back to the institutions for their own review. They will
be  able  to  see  how  they  are  progressing  against  their  own 
objectives,  institutions  that  are  similar  to  theirs, 
national  norms,  and  the  criteria  that  the  Joint  Commission 
will  be  sanctioning. 


Five  Building  Blocks 

The  two  themes  of  the  Agenda  for  Change  are  manifested 
across  five  tracks  or  building  blocks.  The  first  is  the 
identification  and  selection  of  clinical  indicators,  a 
process that the Joint Commission actually started three
years ago in an attempt to clarify the problem-solving
process  called  quality  assurance.  We  asked  hospitals  to 
define  the  scope  of  their  services,  and  within  that  scope 
to identify the important aspects of care (i.e., the
high-risk, high-volume, or problem-prone areas). We then asked
them  to  develop  their  own  indicators  for  these  areas  and  to 
set  the  measurements  or  thresholds  or  criteria  they  would 
find  acceptable.  Where  there  were  outliers,  we  asked  the 
hospitals  to  determine  whether  they  were  bona  fide  and  to 
correct  them  if  they  were  not. 
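
The screening step just described (an indicator, a threshold set by the hospital, and a flag when the threshold is crossed) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the indicator names, rates, thresholds, and the function name `screen_indicators` are all hypothetical and do not come from the Joint Commission. A flagged indicator is a prompt for local peer review, not a judgment of quality.

```python
def screen_indicators(observed, thresholds):
    """Return, in sorted order, the indicators whose observed rate
    exceeds the threshold the hospital set for itself."""
    flagged = []
    for name, rate in observed.items():
        limit = thresholds.get(name)
        # Indicators with no threshold are skipped rather than flagged.
        if limit is not None and rate > limit:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical rates (percent of cases) and hospital-set thresholds.
observed = {"indicator_a": 3.2, "indicator_b": 18.0, "indicator_c": 5.0}
thresholds = {"indicator_a": 2.0, "indicator_b": 25.0}

flagged = screen_indicators(observed, thresholds)
print(flagged)
```

Only `indicator_a` exceeds its threshold here, so it alone would go to peer review in this sketch.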

Now  our  aim  is  to  determine  the  indicators  relevant  to  25 
or  26  clinical  specialties,  those  that  cut  across  hospital 
departments  and  those  that  are  congruent  with  good 
organizational  practice.  We  have  created  several  task 
forces:  one  dealing  with  obstetrical  care,  another  with 
anesthesia,  another  with  hospital-wide  indicators.  Since 
we  obviously  do  not  have  the  staff  to  run  26  task  forces 
over  the  next  three  years,  we  are  also  going  to  develop 
liaisons  with  clinical  specialty  groups  that  will  share 
information  and  provide  resources  in  some  of  the  clinical 
areas.  This  fall  we  will  have  a  forum  of  representatives 
from  these  groups  to  start  the  process  rolling. 

The  second  building  block  is  to  develop  organizational 
indicators.  In  1918,  the  original  set  of  hospital 
standards  took  up  only  a  single  sheet  of  paper.  We  now 
have  270  pages  in  our  manual  for  hospitals — and  it  cannot 
be  true  that  all  of  those  items  are  of  equal  importance.  A 
task  force  of  national  experts  has  identified  103 
organizational  areas  that  they  believe  (in  concert  with 
review  of  the  current  literature)  to  have  an  impact  on 
quality.  The  task  now  is  to  consolidate  them  into  groups 
and  categories,  and  to  study  indicators  that  have  already 
been  developed.  One  uncertainty,  incidentally,  is  whether 
we  should  include  risk  management  in  our  organizational 
areas. 

The  third  objective  is  to  focus  on  severity  adjustment  for 
both clinical and organizational measurements, and here we
are  facing  a  dilemma.  We  have  done  our  first  review  of 
work  in  this  area,  and  we  find  that  there  are  differences 
in  basic  concepts  among  different  approaches;  that 
indicators  identified  as  dependent  variables  in  some 
methodologies  are  independent  variables  in  others;  that 
thinking  in  this  field  continues  to  change;  and  that  the 
sorts  of  adjustments  existing  methodologies  make  may  not 
fit  the  indicators  our  work  groups  are  developing.  At  the 
moment  our  inclination  is  to  look  at  adjustments  as 
indicator-specific  rather  than  across  populations,  but  we 
will  continue  to  look  closely  at  the  problem. 

The fourth track relates to data systems, and it raises a
host  of  questions.  One  is  the  need  to  find  a  relational 
data  base  so  that  what  we  ask  for  does  not  place  an  undue 
burden  on  the  institutions  we  accredit.  We  also  need  to 
find  a  way  to  analyze  the  information  and  return  it  in  a 
format  that  is  relevant  to  the  institution. 

The  fifth  and  last  building  block  is  to  design  an  on-site 
survey  process  that  differs  radically  from  what  we  have 
been  conducting  for  the  past  37  years.  We  now  have  an  on- 
site  survey  that  starts  with  policies,  procedures,  and 
administrative  processes,  and  we  determine  whether  there  is 
compliance  with  standards  in  those  areas.  We  intend  to 
make  a  180-degree  turn  and  use  continuous  information  on 
results-oriented  indicators.  We  will  then  work  backwards 
through  the  decision  trees  with  the  staff  of  the 
institution,  looking  at  structures  and  processes  (both 
administrative  and  clinical)  that  could  impact  the  outcomes 
our  indicators  have  measured.  One  key  component  in  this 
new  system  will  be  an  entirely  new  kind  of  surveyor.  He  or 
she could be a systems diagnostician, perhaps specializing
in either the clinical or the organizational area, and
certainly  will  need  to  have  a  wealth  of  experience  in  the 
assessment  of  data. 


Conclusion 

Obviously,  this  ambitious  project  will  require  more  than 
the  Joint  Commission's  own  resources.  We  are  not  primarily 
a  research  organization,  and  we  count  heavily  on  outside 
research  in  the  field.  We  need  the  answers  to  a  number  of 
major  questions:  Do  the  indicators  we  are  identifying 
accurately  portray  the  institution?  Are  they  relevant,  are 
they  discrete,  are  they  measurable?  Can  a  hospital  collect 
that  information?  How  burdensome  is  it?  Does  collecting 
the  information  help  in  other  sorts  of  activities?  How 
might  we  best  analyze  the  data?  Are  the  variables  for 
adjustment  appropriate?  Was  the  methodology  applied 
appropriate?  What  degree  of  detail  do  we  need  in  our 
adjustments  when  we  are  using  these  measurements  as 
screens,  and  is  that  different  than  the  detail  needed  for 
the  local  peer  review  component  of  the  new  system?  How 
does  the  new  survey  process  operate  compared  to  the  present 
system,  and  what  sort  of  person  do  we  really  need  as  a 
surveyor? 

We  will  begin  to  get  some  answers  very  soon.  We  will 
select  pilot  hospitals  by  the  end  of  June,  collect  data 
through  the  rest  of  1987,  and  begin  pilot  tests  next 
spring.  In  1989  there  will  be  a  dry  run  of  the  new  survey 
process  with  some  100  hospitals;  by  this  time  we  will  have 
recruited and trained our new surveyors, and they will be
in  the  hospitals  for  one  final  test.  By  1990,  the  first 
one-third  of  the  institutions  we  accredit  will  come  under 
the new system if all goes well. Our other programs
(long-term care, hospices, mental health facilities, ambulatory
care) will follow at staggered one-year intervals.


*        *  * 

Dr.  James  Prevost,  a  psychiatrist,  is  Director  of  the  JCAH 
Department  of  Research  and  Development.  He  was  formerly  an 
Associate  Professor  of  Psychiatry  at  the  State  University 
of  New  York  (Syracuse)  College  of  Medicine,  serving 
concurrently  as  Director  of  the  Syracuse  Psychiatric 
Hospital  and  then  the  Richard  Hutchings  Psychiatry  Center. 
Dr.  Prevost  also  spent  five  years  as  Commissioner  of  the 
New  York  State  Office  of  Mental  Health. 


FEEDBACK AND OTHER ISSUES


Applications  of  feedback  to  control  medical  care 
quality 

*  Feedback is data about past performance
which is measured, compared to desired
standards, and disseminated to those who are
decision-makers in that performance.

*  Feedback does not occur naturally. There
must be a systematic effort to collect and
evaluate the information, and then to
motivate decision-makers to use it to
improve performance.

*  Based  on  case  studies  of  feedback  to 
physicians  in  a  hospital  setting,  as  well  as 
on  the  theoretical  literature,  criteria  for 
effective  medical  feedback  have  been 
developed. 


By  Joseph  D.  Restuccia,  Dr.P.H. 

Feedback  is  data  about  past  performance  which  is 
disseminated  to  those  who  are  decision-makers  in  that 
performance.  Figure  1  is  a  simple  model  showing  how  the 
process  works.  It  starts  with  clinical  decisions  and 
management  decisions  which  affect  medical  care.  Often 
overlooked  is  the  fact  that  management  decisions  create  the 
environment  in  which  clinicians  work.  To  the  extent  that 
the environment is supportive, it can enhance quality; to
the  extent  that  it  is  not  supportive,  it  can  impair 
quality.  Management  and  clinical  decisions  interact  to 
produce  actual  performance. 

Performance  can  be  measured  and  compared  to  desired 
standards.  Standards  can  be  normative  or  empirical,  and 
the  evaluation  can  be  explicit  or  implicit.  The  evaluation 
may  also  take  uncertain  events  into  account — and  it  should, 
if  uncertain  events  occur. 

Evaluative information is then fed back to the
decision-makers, and we assume they will use the information to
correct specific errors, to adapt future performance, to
learn, and perhaps to change the way they do things
permanently. Feedback can be done on an individual, group,
or organizational basis.

Two things should be noted. One is that feedback does not
occur naturally in organizations. It must be
systematically conducted; one has to go to the effort of
measuring  performance  and  the  effort  of  feeding  information 
back  to  decision-makers.  A  second  point  to  note  is  that 
the  decision-makers  must  be  motivated  to  use  the  feedback. 
Just  because  they  get  the  information  does  not  mean  they 
will  act  on  it  to  improve  performance. 

Case  Study  #1 

Figure  2  shows  a  study  conducted  at  the  Health  Care 
Research  Center  at  Boston  University  School  of  Medicine. (1) 
It  included  seven  New  England  hospitals,  ranging  from  a 
150-bed  community  hospital  to  an  800-bed  urban  teaching 
hospital, all of which agreed to participate.

The  objective  was  to  learn  if  feedback  on  the 
appropriateness  of  hospital  admissions  and  days  of  care 
would  improve  attending  physicians'  performance  in  these 
areas.  On  a  quarterly  basis,  data  collected  by  the  UR 
coordinators  at  the  hospitals  was  analyzed,  put  into  the 
form  of  service-specific  reports,  and  fed  back  to  the 
hospitals  for  use  in  educational  programs  with  their 
attending  staffs.  The  hospitals  had  to  agree  that  the 
feedback  would  get  to  the  level  of  the  attending 
physicians,  because  they  were  the  decision-makers. 

Figure 2 shows the results. Inappropriate admissions
decreased from the pre- to post-study periods for both the
treatment and control groups. The percentage of
inappropriate admissions (patients who either do not need
to be in the hospital at all or who are admitted too early,
with unnecessary pre-operative days) went from about 14.5%
to about 10-10.5% in both groups. There is no
statistically significant difference: a disappointing
result.

Inappropriate  patient  days  for  both  groups  started  at 
around  37%-38%,  a  figure  that  may  be  somewhat  high  because 
we  used  the  day  of  discharge,  which  is  the  day  most  likely 
to  be  inappropriate,  rather  than  a  randomly-selected  day. 
At  the  end  of  the  study  this  figure  had  decreased  slightly 
in  the  treatment  group  and  increased  slightly  in  the 
control group, for a marginally significant net difference
of 3.5 percentage points: another disappointing result.
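
The net difference quoted above is the change in the treatment group minus the change in the control group. A minimal sketch of that arithmetic, using the pre and post percentages reported in Figure 2, follows. The function name `net_difference` is an assumption made for illustration, and the sketch reproduces only the subtraction, not the significance test used in the study.

```python
def net_difference(treatment_pre, treatment_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group,
    in percentage points, rounded to one decimal place."""
    treatment_change = treatment_post - treatment_pre
    control_change = control_post - control_pre
    return round(treatment_change - control_change, 1)


# Figure 2 values: percent of admissions and of days judged inappropriate.
admissions_net = net_difference(14.5, 10.7, 14.7, 10.0)
days_net = net_difference(36.8, 34.8, 38.3, 39.8)

print(admissions_net, days_net)
```

A negative value means the treatment group improved more than the control group. For days of care the result is -3.5 points; for admissions it is slightly positive, because the control group's rate fell a little further than the treatment group's.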

As  good  researchers,  we  had  the  obligation  to  learn  why  the 
feedback  had  failed.  A  survey  showed  that  in  less  than 
half  of  the  hospitals  did  the  attending  physicians — the 
specific  group  who  had  been  targeted  for  the  feedback — ever 
receive it. Thus there was no possibility whatsoever that
the  feedback  could  be  useful. 

One  hospital  was  an  exception.  It  showed  a  drastic  decline 
in  inappropriateness  rates  for  both  admissions  and  patient 
days.  The  difference  between  that  hospital  and  the  others 
was  that  they  systematically  used  the  information  we 
provided.  In  fact,  they  were  not  satisfied  with  our 
service-specific  data  and  asked  us  to  help  them  break  it 
down into physician-specific profiles. They even
programmed a little Apple computer to produce nice color
bar  graphs  for  the  physicians,  comparing  their  performance 
to  other  members  of  the  medical  staff. 
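
A physician-specific profile of the kind this hospital built amounts to setting one physician's rate beside the rate of the rest of the staff. The sketch below illustrates the idea only: the physician labels, the rates, and the function name `peer_comparison` are hypothetical, and it compares against a simple unweighted mean of the other physicians, which the original study does not specify.

```python
def peer_comparison(rates, physician):
    """Compare one physician's inappropriate-admission rate (percent)
    with the unweighted mean rate of the other physicians."""
    peers = [rate for name, rate in rates.items() if name != physician]
    peer_mean = sum(peers) / len(peers)
    own_rate = rates[physician]
    return {
        "own_rate": own_rate,
        "peer_mean": round(peer_mean, 1),
        "difference": round(own_rate - peer_mean, 1),
    }


# Hypothetical rates for three physicians on one service.
rates = {"A": 22.0, "B": 10.0, "C": 14.0}
profile = peer_comparison(rates, "A")
print(profile)
```

In practice a profile would also need enough cases per physician for the rate to be stable, a point raised later under the criterion of specificity.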

They  also  decided  to  use  the  Appropriateness  Evaluation 
Protocol  criteria  on  a  concurrent  basis.  The  UR 
coordinators  used  this  tool  to  screen  cases  that  were  then 
kicked  out  to  the  physician  advisor.  This  doctor,  who  was 
paid  to  spend  a  significant  portion  of  her  time  as  medical 
director of the utilization review/quality assurance/risk
management  function,  contacted  physicians  whenever  an 
inappropriate  admission  or  inappropriate  day  was  found.  In 
addition,  the  Emergency  Department  physicians  at  the 
hospital  decided  to  use  these  criteria  as  guidelines  in 
deciding  whether  to  admit  patients  presenting  in  the  ER. 
The  difference,  therefore,  is  that  this  hospital  took  the 
study  seriously.  They  used  the  information,  they  provided 
it  to  physicians,  and  they  motivated  the  physicians  to  use 
it  to  improve  performance.  And  the  improvement  was 
dramatic.  At  the  beginning  of  the  study,  inappropriate 
admissions  were  about  1  in  4;  by  the  end,  they  were  less 
than  1  in  20. 

Case  Study  #2 

The  next  study  comes  from  England. (2)  Figure  3  shows  the 
rate  of  two  performance  measures:  rate  of  ruptured 
appendix  or  abscess,  and  negative  laparotomy  rate.  We  can 
see  that  when  this  hospital  started  the  study,  the  rate  of 
ruptured  appendices  was  alarmingly  high  and  the  rate  of 
negative  laparotomy  was  high  as  well.  During  the  study's 
19  months,  both  rates  declined  dramatically,  only  to  rise 
again after the study ended.

What  is  interesting  is  that  this  is  not  an  explicit  quality 
assurance  study.  Instead,  physicians  who  treated  potential 
appendicitis  were  asked  to  help  develop  a  computer  program 
which  would  be  used  to  assist  in  making  this  diagnosis. 
For  this  purpose,  they  were  asked  to  document  their 
justification  for  appendectomy  very  thoroughly.  Once  this 
process  started,  someone  decided  to  see  whether  it  made  any 
difference in performance. It did make a dramatic
difference, by virtue of motivating the physicians to be
very  careful  in  their  decisions. 

Other  Case  Studies 

Figure  4  shows  a  study  conducted  by  a  teaching  hospital  to 
determine  whether  an  awareness  program  for  nosocomial 
infection  could  effect  an  improvement. (3)  The  program  was 
targeted  to  both  nurses  and  physicians.  In  addition  to  the 
usual  lectures  and  printed  materials,  the  hospital 
collected  data  on  nosocomial  infection  rates  and  posted 
them  on  each  of  the  wards.  Each  ward  could  see  what  its 
rate  was:  in  the  past,  at  present,  and  compared  to  other 
wards.  In  periods  2  and  4  on  the  chart,  when  the  program 
was  in  effect,  it  cut  the  nosocomial  infection  rate  in 
half. 

The  next  example  comes  from  the  Delmarva  Peninsula,  where 
the  Peer  Review  Organization  helped  Boston  University 
pilot-test  the  Appropriateness  Evaluation  Protocol  several 
years ago.(4) They liked the AEP criteria and decided to
use them in a physician-specific performance measurement
and feedback system. In Figure 5, the first four periods
give the baseline information per calendar quarter, and
the rate of inappropriateness runs about 25%. After
feedback,  the  rate  dropped  to  about  15%.  That  is  not 
perfect,  but  we  are  not  looking  for  perfection;  we  are 
looking  for  improvement,  and  the  improvement  here  is 
significant.  It  should  be  noted  that  this  program  did  not 
invoke  sanctions — denying  claims  or  withholding  payments — 
although  that  threat  was  implicit. 

The  final  example  shows  a  concurrent  review  system. (5)  In 
California  a  few  years  ago,  we  conducted  a  study  with  four 
hospitals.  We  wanted  to  test  a  very  simple  hypothesis: 
whether  having  UR  nurse  coordinators  provide  information  on 
appropriateness  of  care  to  attending  physicians  directly 
would  have  any  effect  on  their  decision-making.  In 
particular,  we  wanted  to  see  if  it  would  reduce  length  of 
stay  and  inappropriate  days  of  care.  Figure  6  shows  the 
results  compared  to  a  control  group  and  a  group  in  which 
the  information  was  provided  by  UR  physician  advisors.  The 
rates decreased significantly for the treatment group (by 20%
for inappropriate days and 16% for length of stay), while
the two other groups showed no change.

Lessons  Learned 

The  theoretical  literature  on  feedback  comes  from 
organizational  behavior  and  cybernetic  theory. (6)  Although 
it  is  an  extensive  literature,  there  have  been  very  few 
systematic applications in medicine. If we take this
material, plus the experience of the studies, we can garner
a number of useful criteria:

Credible. The information provided (the data feedback)
must be credible. In medicine in particular, it
must be clinically credible if it is to have any impact on
physicians. Physicians must see it as valid.


Relevant.  Feedback  must  be  relevant  to  the  individual 
physician's  practice,  not  to  some  other  area  or  some  other 
specialty.  For  example,  I  personally  do  not  think  that 
length-of-stay  profiles  and  cost  profiles  are  meaningful  to 
physicians.  That  is  why  we  see  so  little  effect  from 
feeding  them  back,  no  matter  how  well  the  data  are 
manipulated  and  the  case  mix  adjusted.  These  measures  are 
not  as  meaningful  as  data  on  the  actual  care  of  patients. 

Verifiable.  For  example,  it  is  preferable  wherever 
possible  to  include  case  numbers  in  information  about 
exceptions  and  problem  cases.  Let  the  physician  go  back 
and  see  the  case  for  himself  or  herself,  and  decide  whether 
he  or  she  agrees  with  the  information  the  system  provides. 
This  also  means  that  we  cannot  have  black-box  systems  where 
the  decision  algorithms  are  opaque  to  the  physician;  these 
systems  will  not  be  trusted. 

Comprehensible. Physicians must be able to understand
the  information.  A  picture  is  worth  a  thousand  numbers;  if 
you  can  put  the  data  in  graphic  format,  they  are  much  more 
likely  to  have  an  effect. 

Specific.  Information  should  be  as  specific  as 
possible  while  maintaining  sample  size  considerations. 
Information  at  the  level  of  the  individual  physician  is 
best  of  all. 

Comparative.  Peer  group  comparisons  are  important 
wherever  you  can  provide  them,  so  that  an  individual  can 
evaluate  his  or  her  performance  relative  to  that  of  a  group 
with  which  he  or  she  identifies. 

Unfinalized. When the feedback system stops,
performance will revert to its former level. The system
must be ongoing.

Reinforced.  Finally,  it  is  necessary  to  motivate  the 
decision-maker  to  use  the  information.  There  are  certain 
intrinsic  rewards  that  arise  from  the  professionalism  of 
physicians  and  other  practitioners.  If  authority  figures 
endorse  the  information  as  well,  however,  the  individual 
physician is much more likely to pay attention to it.
Financial rewards and sanctions provide another kind of
incentive. 

*        *  * 

Dr.  Joseph  D.  Restuccia  holds  his  doctorate  in  Public 
Health,  and  he  is  Associate  Professor  of  Health  and 
Operations  Management  at  Boston  University.  He  has  also 
served  as  consultant  to  many  state  and  federal  agencies, 
hospitals,  and  health  insurers. 


References 

1. Restuccia, J.D., Payne, S.M.C., Welge, C.H., et al.
Reducing Inappropriate Use of Inpatient Medical/Surgical
and Pediatric Services. HCFA Contract No. 18-C-98317/1-02,
March 1986, NTIS Publ. No. PB 87-112041/AS.

2. de Dombal, F.T., Leaper, D.J., Horrocks, J.C. Human and
Computer-Aided Diagnosis of Abdominal Pain: Further Report
with Emphasis on Performance of Clinicians. British
Medical Journal 1:376-380, 1974.

3. Britt, M.R., Schleupner, C.L.J., Matsumiya, S. Severity
of  Underlying  Disease  as  a  Predictor  of  Nosocomial 
Infection.  Journal  of  the  American  Medical  Association 
239:1047-1050,  1978. 

4. Borchardt, P.J. Nonacute Profiles: Evaluation of
Physicians' Nonacute Utilization of Hospital Resources.
Quality Review Bulletin 2:21-26, 1981.

5. Restuccia, J.D. The Effect of Concurrent Feedback in
Reducing Inappropriate Hospital Utilization. Medical Care
20:46-62, 1982.

6. Nadler, D.A. Feedback and Organization Development:
Using Data-Based Methods. Reading, MA: Addison-Wesley,
1977.




FIGURE 2


FEEDBACK STUDY RESULTS

ADMISSIONS:      PRE     POST    CHANGE

  TREATMENT      14.5    10.7    -3.8
  CONTROL        14.7    10.0    -4.7
                                 N.S.

DAYS:            PRE     POST    CHANGE

  TREATMENT      36.8    34.8    -2.0
  CONTROL        38.3    39.8    +1.5
                                 P < .05


FIGURE 3


Changes in frequency of ruptured appendix or abscess and of
negative laparotomies (for non-specific severe abdominal
pain) during and after study of computer-assisted diagnosis.
Horizontal axis: months of study, months after study.


FIGURE 4


EFFECT OF AWARENESS PROGRAM
ON NOSOCOMIAL INFECTION RATES


- BRITT M. et al. JAMA, MARCH 13, 1978


FIGURE 5


FIGURE 6


IMPACT  OF  CONCURRENT  UTILIZATION  REVIEW 
IN  FOUR  CALIFORNIA  HOSPITALS 

•     INAPPROPRIATE DAYS DOWN 20%
•     LENGTH OF STAY DOWN 16%


- RESTUCCIA, J.D. THE EFFECT OF CONCURRENT FEEDBACK IN
REDUCING INAPPROPRIATE HOSPITAL UTILIZATION, MEDICAL CARE,
JANUARY 1982.


FEEDBACK AND OTHER ISSUES


The Maine experience: What is effective? Education, a
carrot, or a stick?

*  The  Maine  Medical  Assessment  Project  uses 
small  area  analysis  of  hospital  discharge 
data  to  educate  physicians  about  variations 
in practice patterns. It provides feedback
in  a  confidential  and  non-punitive  setting. 

*  The  Project  is  based  on  the  belief  that  high 
variations  are  not  consistent  with  good 
medical  care  in  that  they  represent  a  lack 
of  consensus  among  physicians.  Its  feedback 
succeeds  in  lowering  rates  by  showing 
physicians  that  their  practice  deviates  from 
the  norm,  but  it  does  not  create  consensus. 

*  To  achieve  underlying  agreement  about  which 
methods  of  treatment  are  preferable,  outcome 
studies  are  needed.  The  Maine  Project  is 
moving  in  this  direction. 


By  Robert  Keller,  M.D. 

The  Maine  Medical  Assessment  Project  has  been  in  existence 
for  six  years.  Based  on  the  small  area  analysis  concept, 
it  uses  total  hospital  discharge  data  to  collect  variations 
information on specific diagnoses, diseases, and
procedures.

We  work  through  specialty-oriented  study  groups.  One 
cannot  work  with  general  populations  of  physicians 
effectively.  They  do  not  seem  to  understand  the 
information  they  receive,  and  they  reject  it.  Specialty 
study  groups,  on  the  other  hand,  have  been  quite  successful 
in  dealing  with  physicians,  and  Figure  1  shows  the  seven 
groups  currently  active  in  Maine.  Each  is  led  by  a 
practicing  physician,    and  most  of  the  physicians  involved 


Figure 1: Study Groups

Urology            Orthopaedics     Obstetrics/Gynecology
Medicine           Pediatrics       Ophthalmology
Family Practice
We go through a process of accepting data, evaluating,
refining,  studying,  and  finally  making  recommendations.  We 
feel,  as  do  all  people  who  look  at  data,  that  high 
variation  in  rates  of  treatment,  admission,  or  other 
measures  is  not  consistent  with  good  medical  care.  We  also 
believe  that  variations  are  primarily  driven  by  a  lack  of 
consensus  among  physicians,  although  there  may  be  other 
reasons  as  well. 

Our  program  does  not  measure  quality  directly.  We  have 
often  discussed  quality  measurements,  but  we  do  not  at  this 
time  have  any  quality  screens.  We  believe,  however,  that 
the  study  group  process  results  in  improvement  in  quality 
of  care. 


Figure 2: Feedback Process

1.  Meeting  with  involved  physicians. 

2.  Presentation  of  data  and  education  regarding  rates. 

3.  Confidential,  educational,  non-threatening  feedback. 

4.  Peer  Pressure  —  The  Key  Ingredient. 


Figure  2  reviews  our  feedback  process.  It  is  rather 
simple:  we  meet  with  doctors  who  represent  high,  medium, 
and  low  rate  areas,  we  discuss  the  data  and  the  variations, 
and  we  make  them  aware  of  their  rates.  Usually  they  are 
not  aware:  the  average  physician  in  practice  does  not  know 
his  rate  for  various  procedures  or  how  it  compares  with 
others'.  We  do  piecework  in  medicine.  Practicing 
physicians  go  from  one  case  to  the  next,  and  they  have  no 
idea  where  they  are  in  the  grand  scheme  of  things  until 
someone  tells  them. 

Of  all  the  program  elements  that  should  be  emphasized,  I 
believe  the  most  important  is  that  our  feedback  system  is 
confidential, non-threatening, non-regulatory,
non-punitive, and educational. That is why it works.


Laminectomy  Variations 

To  move  to  a  specific  example  of  what  we  have  done,  Figure 
3  shows  the  Maine  rates  for  lumbar  laminectomy,  1980-84. 
This  is  one  of  the  highly  variable  procedures  for  which 
there  is  a  marked  lack  of  consensus.     It  has  been  monitored 
by the orthopedic study group of orthopedic surgeons and
neurosurgeons,  both  of  whom  do  this  procedure.  We  found 
that  the  rates  increased  slowly  through  the  early  1980s, 
then  took  a  dramatic  and  unexpected  jump  in  1983  and  1984. 
Further  analysis  showed  that  most  of  the  increase  occurred 
in  one  large  urban  area  and  three  surrounding  communities 
(Figure 4).

This  called  for  action.  We  met  with  physicians  and 
discussed the clinical issues, indicating to them among
their peers in a confidential setting that we did not think
the increase in the rate was appropriate. Using this
method  alone,  we  were  able  to  effect  a  decrease  in  lumbar 
laminectomy  in  less  than  a  year;  and  by  the  end  of  1986,  as 
Figure  5  shows,  the  rate  had  declined  to  what  one  would 
anticipate  for  the  state  average.  Figure  6  demonstrates 
the  downward  trend  in  overall  state  rates  for  this 
procedure.

Why  did  these  rates  change?  The  only  explanation  we  have 
found is the in-migration of several new physicians: highly
qualified,  very  able,  and  vigorous  doctors.  When  they 
learned  what  their  rates  were,  when  they  heard  their  peers 
around  the  state  in  open  and  very  frank  discussion  talk 
about  what  was  right  and  wrong  in  high  rates  of  surgery  for 
these  particular  patients,  they  listened  and  their  rates 
changed.


Other  Variations 

The  orthopedic  and  neurosurgical  experience  with 
laminectomy  has  been  paralleled  by  other  study  groups. 
Figure  7  illustrates  a  long-standing  problem  in  Maine. 
Hysterectomy  is  highly  variable,  and  the  chart  shows  a 
region  where  the  rates  were  significantly  above  the  state 
average  for  many  years.  It  took  a  long  time  to  convince 
physicians  in  that  region  to  bring  their  rates  into  line 
with  the  state  average,  but  by  1985-86  they  had  finally 
done  so. 

The  prostatectomy  example  in  Figure  8  shows  a  constant 
increase  in  rates  in  the  early  1980s.  At  this  point  the 
urology  group  began  an  outcome  study  on  the  basis  of  some 
disturbing  mortality  statistics  relating  to  prostatectomy. 
Approximately  three-fourths  of  the  state's  urologists 
agreed  to  voluntarily  enroll  their  patients  in  the  study, 
and  the  resulting  "sentinel"  effect  caused  a  rather 
dramatic  drop  in  the  rates.     They  are  still  going  down. 
Will these rates eventually rise again? I have no idea,
but  I  would  agree  with  Dr.  Restuccia  that  one  must  continue 
to  monitor  and  provide  feedback. 


What  the  Program  Demonstrates 

We  have  shown  in  Maine,  to  our  satisfaction  and  I  think  to 
others'  as  well,  that  the  technique  of  data  feedback  and 
education  works.  It  has  earned  the  support  and  cooperation 
of  practicing  physicians  and  organized  medicine,  for 
several  reasons. 

Most  important,  the  process  in  Maine  is  controlled  by  the 
medical  community.  Physicians  are  in  charge,  and  we  have 
found  that  letting  physicians  do  their  own  job  and  take 
care  of  their  own  problems  is  very  effective  in  influencing 
rates  of  practice.  It  is  also  important  to  maintain  the 
confidentiality  of  the  study  group  activities;  that  avoids 
anxiety,  resentment,  and  hostility  among  practicing 
physicians.  The  result  is  that  once  physicians  become 
familiar  with  data,  they  recognize  that  high  variations  in 
practice  are  inappropriate,  and  they  are  anxious  to  study 
and  work  on  the  problem.  We  have  had  excellent  voluntary 
cooperation  from  physicians  across  the  state. 

Those  are  the  strengths.  There  are  weaknesses  as  well.  We 
do  not  measure  quality,  because  we  are  not  sure  how  to  do 
it.  We  do  know,  however,  that  it  is  very  difficult  and 
very  expensive.  The  studies  we  do  now  are  possible  at 
little  cost.  When  one  starts  looking  at  individual  patient 
records  the  cost  goes  up. 

We  have  seen  that  one  can  level  rates  very  quickly.  We 
have  also  seen  that  they  may  go  back  up  once  monitoring  and 
feedback  end.  Why?  Largely  because,  when  there  is  lack  of 
consensus  about  a  procedure,  physicians  will  revert  to 
whatever  pattern  is  most  comfortable  for  them.  Our  project 
does  not  necessarily  produce  consensus.  It  influences 
rates,  but  the  effect  may  be  temporary.  In  fact,  I  could 
show  other  areas  of  Maine  where  the  laminectomy  rates  are 
starting  to  slide  up,  while  they  have  dropped  in  the  areas 
under  direct  study. 

To  bring  about  permanent  change,  we  need  outcome  studies. 
The  urologists  have  just  completed  one,  and  it  is  a  fine 
study.  We  are  planning  to  do  another  on  laminectomy.  It 
will  be  a  patient-oriented  study  focusing  on  quality  of 
life,  and  we  hope  it  will  answer  questions  for  both 
patients  and  physicians  as  to  the  utility  of  this 
operation. 
Conclusion


What is effective - education, a carrot, or a stick? I can
tell  you  that  a  stick  is  unwise.  It  may  beat  people  into 
submission,  but  it  does  not  produce  cooperation, 
credibility,  or  quality  of  care.  A  carrot  would  be  nice, 
but  we  have  seen  that  education  really  does  work. 
Education  is  the  hallmark  of  the  Maine  program,  which  has 
demonstrated  that  one  can,  by  working  with  doctors  in  an 
educational  format,  influence  the  way  they  practice. 


*        *  * 

Dr.  Robert  Keller  is  a  practicing  orthopedic  surgeon  in 
Maine,  and  the  leader  of  the  orthopedic  study  group  of  the 
Maine  Medical  Assessment  Project.  He  has  served  on  the 
faculty  of  both  the  Harvard  and  the  University  of 
Massachusetts  Medical  Schools. 
FIGURE 5: 1980-1986 WORKER'S COMP LAMINECTOMIES

FIGURE 6: MAINE WORKER'S COMP LAMINECTOMIES
FEEDBACK AND OTHER ISSUES


Alternative  quality  of  care  measures 

*  The  ideal  quality  of  care  measurement  system 
is  based  on  random  assignment  of  patients  to 
providers.  This  type  of  system  is  actually 
in  place  in  some  areas,  and  it  could  easily 
be  installed  in  some  others. 

*  Another  approach  is  to  shift  the  burden  of 
proof  for  quality  issues  from  the  Health 
Care  Financing  Administration  (which  must 
now  prove  that  a  provider  is  bad)  to  the 
providers  (which  would  be  required  to  prove 
that care is good).


By  Duncan  Neuhauser,  Ph.D. 

My  charge  was  to  talk  about  everything  that  has  not  been 
covered  elsewhere  in  the  Symposium — but  I  have  chosen 
another  role.  I  decided  to  look  at  how  one  would  construct 
a  quality  measurement  system  if  one  really  wanted  to  do  it 
right.  I  also  decided  that  this  ideal  system  must  not  cost 
the  Health  Care  Financing  Administration  any  money.  That 
was  my  challenge. 

The  correct  way  to  assess  quality  of  care,  in  my  view,  is 
to  use  random  assignment  of  patients  to  different  provider 
groups.     Let  me  tell  you  how  that  might  work. 

Requirements  for  Random  Assignment 

First,  there  ought  to  be  a  social  obligation  on  all  of  us 
to  participate  in  clinical  trials,  just  as  we  have  social 
obligations  to  pay  taxes,  drive  on  one  side  of  the  road, 
donate  blood  and  organs,  get  drafted,  and  get  vaccinated. 
To  put  this  concept  into  action,  employers  who  offer 
several  health  plans  should  induce  employees  who  are 
uncertain  about  which  plan  to  join  to  accept  random 
assignment,  and  various  rewards  could  be  offered  to 
employees  who  agree.  In  the  corporation  where  I  work, 
there  are  quite  a  few  employees  who  can't  make  up  their 
minds  about  the  choice  of  a  health  plan.  I  think  my 
employer  and  these  undecided  employees  should  have  the 
obligation  to  flip  a  coin,  and  to  follow  up  what  occurs. 

Second,  if  I  may  distinguish  between  primary  care  providers 
and  downstream  providers  that  receive  patients  from  primary 
care  sources,    let  me  propose  a  requirement  for  licensure. 
All downstream provider organizations should be obliged, if
asked,  to  accept  randomly  assigned  patients.  They  should 
further  be  obliged  to  undertake  an  honest  evaluation  of  the 
outcomes  for  patients  who  are  received  that  way,  and  to 
allow  an  independent  auditor  to  compare  their  outcomes  to 
others'. This should be a simple obligation and
requirement for licensure and accreditation.

We  should  also  consider  the  key  role  of  what  I  would  call 
block  bookers.  These  are  usually  physicians,  and  they  are 
usually  with  primary  care  capitated  plans  or  Preferred 
Provider  Organizations  that  negotiate  for  patient  days 
downstream. All too often block bookers are lazy; they are
not  careful  in  the  choice  of  their  downstream  providers.  I 
think  block  bookers  are  so  important  in  structuring  the 
evaluation  of  care  that  they  should  be  certified,  and  they 
should  be  educated  in  the  systematic  and  careful  assignment 
of  patients.  Certainly  they  should  be  required,  when  in 
doubt,  to  send  one  patient  this  way  and  one  patient  that 
way. 

Examples  of  Random  Assignment 

I  hear  you  saying  to  yourselves  that  this  is  all  well  and 
good,  but  in  no  way  is  it  practical.  My  answer  to  you  is: 
you're wrong.

Currently  at  two  hospitals  in  Cleveland,  there  is  random 
assignment  of  patients  on  an  ongoing  basis  to  different 
units  within  the  hospitals.  Other  hospitals  have  randomly 
assigned  patients  to  different  home  health  care  providers. 
Additional  providers  are  gearing  up  to  do  this  in  other 
settings,  and  still  others  could  do  it  easily  if  they 
chose.  Let  me  give  you  two  examples,  both  from  outside 
this  country. 

One  possibility  that  intrigues  me  comes  from  Reykjavik, 
Iceland.  The  city  has  three  hospitals.  On  the  first  day, 
all  emergency  cases  go  to  Hospital  1,  on  the  second  day  to 
Hospital  2,  and  on  the  third  day  to  Hospital  3.  I  propose 
to  you  that  there  is  a  fairly  easy  way  to  evaluate  quality 
of  care  in  those  three  institutions,  if  they  chose  to  do 
it. 
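
The rotation described above behaves like a natural experiment: the calendar, not the patient or the physician, decides which hospital receives a case. The following is a minimal sketch, assuming a strict three-day cycle and using invented case records; the function names and the data are hypothetical. Strictly speaking, this is systematic rather than random assignment, so a real comparison would still need to check that case mix is similar across the three hospitals.

```python
from collections import defaultdict

HOSPITALS = ["Hospital 1", "Hospital 2", "Hospital 3"]

def hospital_for_day(day_index):
    """Day 0 -> Hospital 1, day 1 -> Hospital 2, day 2 -> Hospital 3, then repeat."""
    return HOSPITALS[day_index % len(HOSPITALS)]

def outcome_rates(cases):
    """cases: iterable of (day_index, died) pairs. Returns death rate by hospital."""
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for day_index, died in cases:
        hospital = hospital_for_day(day_index)
        totals[hospital] += 1
        deaths[hospital] += int(died)
    return {h: deaths[h] / totals[h] for h in totals}

# Invented emergency cases: (day of arrival, whether the patient died).
cases = [(0, False), (0, True), (1, False), (1, False), (2, False), (3, False)]
print(outcome_rates(cases))
```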

The  second  example  comes  from  Israel.  I  was  talking 
recently  with  someone  who  is  going  to  become  a  senior 
manager  in  Kupat  Holim  Health  Service,  a  large  organization 
that  is  similar  to  the  Veterans  Administration  or  Kaiser 
Permanente  here.  Kupat  Holim  has  about  150  primary  care 
clinics  across  the  country.  They  could  collect  information 
about these clinics and, when they intervene in a
management way, randomly choose some clinics for
intervention and leave others as controls. In this
example,  randomization  occurs  not  at  the  patient  level,  but 
at  the  management-to-provider  level  in  order  to  monitor 
systematic  changes.  I  would  propose  that  as  the  correct 
way  to  manage  a  large-scale  system,  whether  it  is 
McDonald's, or Kaiser Permanente, or Kupat Holim.
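
Randomizing at the clinic level rather than the patient level can be sketched in a few lines. The example below is only an illustration: the clinic numbering, the even 75/75 split, and the fixed seed are assumptions made here, not details of any Kupat Holim procedure.

```python
import random

def randomize_clinics(clinic_ids, n_intervention, seed=None):
    """Randomly split clinics into an intervention group and a control group."""
    clinic_ids = list(clinic_ids)
    rng = random.Random(seed)          # a fixed seed makes the draw auditable
    chosen = set(rng.sample(clinic_ids, n_intervention))
    intervention = sorted(chosen)
    control = sorted(c for c in clinic_ids if c not in chosen)
    return intervention, control

clinic_ids = range(1, 151)             # about 150 clinics, numbered for illustration
intervention, control = randomize_clinics(clinic_ids, n_intervention=75, seed=1987)
print(len(intervention), len(control))
```

Recording the seed lets an independent auditor reproduce the assignment and confirm that management did not hand-pick the intervention clinics.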

Changing  the  Burden  of  Proof 

While  we  wait  for  perfection  to  occur,  there  are  some 
things  we  can  do.  One  area  I  would  concern  myself  with 
involves  changing  the  burden  of  proof.  At  present  the 
burden  lies  with  HCFA,  which  must  show  that  a  hospital  or 
other  provider  is  bad.  I  would  shift  that  burden  to  the 
providers,  who  would  be  required  to  come  forth  and  show 
that  they  are  good.  I  will  give  you  a  hypothetical  example 
of  how  it  might  be  accomplished — not  that  this  is 
necessarily  the  right  or  only  way  to  do  it.  I  merely 
present  it  as  something  to  think  about. 

In  order  for  providers  in  a  particular  region  to  receive 
Medicare  reimbursement,  they  must  agree  to  divide 
themselves  into  three  groups:  one-third  that  provides 
above-average  quality  of  care,  one-third  that  provides 
average  care,  and  one-third  that  is  below  average.  Those 
in  the  first  group  will  receive  50  cents  more  per  patient 
day  in  reimbursement,  those  in  the  second  will  receive  the 
same  amount  as  at  present,  and  those  in  the  third  will 
receive  50  cents  less  per  day.  The  money  is  clearly  not  an 
issue,  but  the  implications  for  marketing  and  public 
perception  of  quality  are  enormously  powerful.  If  a  region 
does  not  agree  to  this  arrangement,  all  of  its  providers 
would  be  penalized  50  cents  per  patient  day.  I  call  this 
the  Quality  of  Care  Consultants'  Full  Employment  Act. 
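
The arithmetic of this hypothetical scheme is simple enough to write down. The sketch below encodes the 50-cent adjustments described above; the provider list and patient-day figures are invented, and the function names are illustrative only. It shows that the scheme costs nothing when the above-average and below-average groups account for equal numbers of patient days, which is an assumption of the example rather than something the proposal guarantees.

```python
ADJUSTMENT_CENTS = {"above": 50, "average": 0, "below": -50}
REGION_PENALTY_CENTS = -50   # applied to every provider if the region declines

def per_diem_adjustment(tier, region_agreed=True):
    """Change in payment per patient day, in cents."""
    if not region_agreed:
        return REGION_PENALTY_CENTS
    return ADJUSTMENT_CENTS[tier]

def net_cost_cents(providers, region_agreed=True):
    """providers: list of (tier, patient_days). Net change in total payments, in cents."""
    return sum(per_diem_adjustment(tier, region_agreed) * days
               for tier, days in providers)

# Invented region: three providers with equal patient days, one in each tier.
providers = [("above", 10_000), ("average", 10_000), ("below", 10_000)]
print(net_cost_cents(providers))          # 0
print(net_cost_cents(providers, False))   # -1500000
```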

Final  Thoughts 

I  have  a  few  more  comments.  Looking  at  quality  of  care  is 
not  new.  It  has  been  a  problem  for  a  very  long  time.  One 
of  my  favorite  examples  comes  from  the  earliest  days  of 
Boston.  In  the  very  year  the  city  was  founded,  1630,  there 
was  a  court  case  involving  a  man  named  Nicholas  Knopp.  He 
was  fined  five  pounds  or  whipped — I  guess  he  had  his 
choice — for  selling  a  cure  for  scurvy  "of  no  worth  or 
value".  I  would  like  to  know  how  the  court  decided  the 
cure  was  of  no  worth  or  value.  So,  from  at  least  1630  to 
the  present,  quality  of  care  has  been  a  major  issue.  It 
hasn't  gone  away  and  I  doubt  it  ever  will. 

Another  comment:  we  have  been  hearing  about  physicians  all 
day  long.  We  at  least  ought  to  take  our  hats  off  to  the 
nurses,  who  were  involved  in  the  best  quality  of  care  study 
ever known. That is, of course, the one Florence
Nightingale  conducted  at  Scutari  Hospital  during  the 
Crimean  War.  She  had  the  good  fortune — or  the  soldiers  had 
the  misfortune — to  come  into  a  situation  where  432  of  every 
1,000  patients  died.  When  she  left,  22  of  every  1,000 
died.  If  we  had  mortality  differences  of  432  to  22,  my 
guess  is  that  we  would  be  talking  today  about  a  very 
different  set  of  problems. 
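
To put the two figures quoted above on a common footing, the short calculation below works out the absolute and relative reduction in deaths. It uses only the 432 and 22 per 1,000 given in the talk; the variable names are illustrative.

```python
before_per_1000 = 432   # deaths per 1,000 patients on arrival, as quoted
after_per_1000 = 22     # deaths per 1,000 patients on departure, as quoted

absolute_reduction = before_per_1000 - after_per_1000      # deaths avoided per 1,000
relative_reduction = absolute_reduction / before_per_1000  # share of deaths avoided
risk_ratio = before_per_1000 / after_per_1000

print(absolute_reduction)             # 410
print(f"{relative_reduction:.1%}")    # 94.9%
print(f"{risk_ratio:.1f}")            # 19.6
```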

And  a  final  comment:  the  measure  of  a  well-functioning 
health  care  system  is  that  you  cannot  measure  differences 
in  quality  of  care. 


*        *  * 

Duncan  Neuhauser,  who  holds  a  Ph.D.  in  Business 
Administration,  is  Professor  of  Epidemiology  and 
Biostatistics, and Keck Foundation Senior Research
Scholar at Case Western Reserve
University, where he also serves as Co-Director of the
Health Systems Management Center. He is a consultant to
the Cleveland Metropolitan General Hospital and Cleveland
Clinic, a member of the Institute of Medicine, and Editor
of Medical Care.
WORKSHOP I


Panel  on  epidemiologic  monitoring,  data  analysis,  and 
feedback 

Kenneth  Manton,  Ph.D.,  leader;  Robert  Keller,  M.D.; 
Joseph  Restuccia,  Ph.D.;  Gary  Gaumer,  Ph.D. 

Discussion  summarized  by  Dr.  Manton. 


As  we  were  the  epidemiologic  group,  the  underlying  theme 
was  a  systematic  approach  to  linking  data  bases  in  order  to 
look  at  population  health  characteristics  and  the  evolution 
of  service  use  on  an  individual  level.  We  were  especially 
interested  in  linking  Medicare  data  with  other  national 
studies  of  select  populations.  It  should  be  possible  to 
get  additional  mileage  from  looking  at  individual  health 
changes  and  the  impact  of  service  use  on  those  changes  in 
the  longer-term  perspective  of  studies  across  service 
types.  And  it  should  be  possible  to  do  that  cost- 
effectively  by  linking  existing  data  sets  to  build 
histories  of  service  use. 

We  talked  about  cohort  studies.  There  is  a  desire  to 
anticipate  the  Medicare  population's  changing  patterns  of 
need,  so  that  one  can  make  evolutionary  changes  in  service 
packages.  Anticipatory  studies  of  younger  populations, 
such  as  the  55-64  age  group,  would  be  one  way  to  accomplish 
this  end. 

We  also  talked  about  how  changing  mortality  relates  to 
underlying  changes  in  morbidity  and  disability  patterns. 
There  is  the  question  of  active  life  expectancy:  longer 
life  can  simply  mean  a  longer  period  of  disability,  and 
thus  a  greater  need  for  long-term  care.  This  brought  up 
the  question  of  measuring  disability.  Everyone  agrees 
it  is  important,  and  there  is  evidence  that  whatever 
disability  measure  you  use  will  be  predictive  of  such 
outcomes  as  mortality  and  subsequent  service  use.  But 
there  is  always  discomfort  about  choosing  a  disability 
measure;  all  of  them  somehow  feel  very  soft. 

A  related  question  is  whether  it  is  more  appropriate  to 
think  in  terms  of  primary  intervention  than  intervention 
aimed at reducing risk factors. The Lalonde Report from
Canada found that perhaps 85% of all health problems were
exogenous  to  the  health  care  system.  What  can  we  do  to 
change  those  background  factors?  Can  we  somehow  link 
disability  and  morbidity  in  terms  of  targeting  service  uses 
and  interventions  to  create  a  quality-adjusted  improvement 
in life expectancy and living?

On  another  topic,  several  group  members  pointed  out  that  we 
have  talked  so  far  only  about  quality  assurance  in  terms  of 
a  minimal  standard,  yet  there  are  positive  measures  of 
quality  as  well.  We  should  think  about  developing 
incentives  or  feedback  that  would  improve  quality  of 
service  delivery  and  create  positive  behavior,  whether 
through  a  reimbursement  mechanism  or  through  the  voluntary 
organizations  we've  seen  in  local  area  studies.  But  we  do 
need  to  improve  quality  at  the  high  end  in  addition  to 
working  on  problems  at  the  low  end  of  the  spectrum. 

There  were  questions  on  losing  populations  and  data  systems 
for  people  in  capitated  systems.  Individuals  who  enter 
capitated  systems  fall  into  a  black  hole  where  they  cannot 
be  observed.  What  can  one  do  to  gather  basic  service  use 
data  on  these  people?  Should  there  be  a  minimal  data  base 
that  continues  to  track  them  as  individuals  while 
they  are  in  capitated  systems?  We  found  no  resolution,  but 
we did identify this as a potential problem that needs more
investigation  and  research. 

Data  linkages  were  discussed  in  detail.  We  talked  about 
cause-specific  mortality  systems  and  measures  of  the 
incidence  and  prevalence  of  both  disease  and  disability  in 
the  population,  and  whether  linkages  with  such  systems 
could  be  built  into  the  Medicare  files.  We  also  talked 
about  linking  Medicare  and  Medicaid.  There  was  particular 
interest  in  the  longitudinal  follow-up  of  individuals  and 
the  capacity  to  track  the  evolution  of  service  use  for 
various  cohorts,  rather  than  the  cross-sectional,  period- 
based  analyses  that  aggregate  evaluations  tend  to  produce. 
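
The kind of longitudinal linkage discussed here can be sketched as follows. The example assumes that every file carries the same person identifier and an ISO-formatted service date; both are assumptions of this illustration, and the records and the function name `build_histories` are invented.

```python
from collections import defaultdict

def build_histories(*sources):
    """Merge records from several files into one dated history per person.

    Each source is a (source_name, records) pair, and each record is a
    (person_id, date, service) tuple with the date written as YYYY-MM-DD.
    """
    histories = defaultdict(list)
    for source_name, records in sources:
        for person_id, date, service in records:
            histories[person_id].append((date, source_name, service))
    for events in histories.values():
        events.sort()   # ISO dates sort chronologically as plain strings
    return dict(histories)

# Invented records for two people across two files.
medicare = [("P1", "1985-03-02", "hospital stay"),
            ("P2", "1985-06-10", "physician visit")]
medicaid = [("P1", "1986-01-15", "nursing home"),
            ("P1", "1984-11-20", "home health")]

histories = build_histories(("Medicare", medicare), ("Medicaid", medicaid))
for event in histories["P1"]:
    print(event)
```

Once records are organized per person in date order, the evolution of service use for a cohort can be followed over time instead of being summarized one period at a time.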

Finally,  group  members  expressed  frustration  at  not  being 
able  to  pursue  all  of  these  issues  further.  Many 
recommended  a  follow-up  conference — or,  if  not  a 
conference,  a  smaller  and  more  manageable  working  group. 
If  HCFA  couldn't  do  this  alone  (the  agency  being  somewhat 
"conferenced-out"  at  the  moment,  as  several  staff  members 
pointed out), perhaps there could be a collaboration
between  HCFA  and  CDC,  or  a  consortium  of  groups  to  sponsor 
an  ongoing  effort.  This  effort  is  definitely  needed,  as  we 
have  thus  far  only  scratched  the  surface. 
WORKSHOP II


Panel  on  hospital  quality  of  care  outcome  data  development 
and  collection 

Kathleen  Lohr,  Ph.D.  (Institute  of  Medicine,  National 
Academy  of  Sciences),  leader;  Douglas  Wagner,  Ph.D.; 
James  Prevost,  M.D.;  Duncan  Neuhauser,  Ph.D.;  Kathleen 
Griffin,  Ph.D.  (American  College  of  Health  Care 
Administrators) 

Discussion  summarized  by  Dr.  Lohr. 


One  important  point  that  emerged  from  our  discussions  is 
that, for the questions we were asked to look at
(measuring the quality of care in hospitals, thinking about
outcomes and about the data sets that permit outcome
measurement), we live with a perpetual tension between two
approaches.  We  must  be  willing  to  be  patient  about 
developing  broad  strategies  for  quality  assurance,  because 
this  takes  time.  On  the  other  hand,  clearly  some  things 
must  be  dealt  with  right  away.  Certainly  that  includes 
coming  to  grips  with  defining  outcomes  such  as  hospital- 
specific  mortality  rates  and  finding  ways  to  make  this 
information  available  to  consumers  and  purchasers. 

Another  point  that  emerged  is  the  enormous  number  and  range 
of  quality  assurance  systems  now  being  developed  in 
hospital  systems  nationwide.  We  heard  from  representatives 
of  the  Department  of  Defense,  the  Veterans  Administration, 
the  Federation  of  American  Health  Systems,  two  different 
Catholic hospital systems, Blue Cross/Blue Shield
(operating on behalf of their major national clients), and,
of  course,  from  HCFA.  I  think  we  all  had  our  eyes  opened 
as  to  the  extent  and  creativity  of  this  activity,  not  to 
mention  the  differences  that  exist  in  kinds  of  data  these 
systems  collect  and  their  methods  for  feeding  information 
back  to  their  providers.  In  fact,  one  of  our  group's  clear 
recommendations  is  that  we  need  to  know  more  about  these  QA 
systems — not  in  order  to  evaluate  them,  but  simply  to  get 
descriptive,  comparative  information  on  their  purposes, 
their  data  systems,  and  their  methods.  The  group  felt 
strongly  that  we  would  all  benefit  from  knowing  what 
everyone  else  is  doing,  and  one  proposal  was  that  an 
"inventory"  of  QA  systems  be  compiled. 

In thinking about developing practical data sets for
hospital QA, one question was raised repeatedly: how are
these  data  going  to  be  used?  The  group  concluded  that 
variables,  measures,  and  data  elements  cannot  sensibly  be 
defined  if  the  ultimate  purposes  they  will  serve  are  not 
known.  For  example,  efforts  might  be  focused  on  finding 
poor  quality  of  care  and  trying  to  improve  it,  but 
alternatively  one  might  work  from  the  broader  perspective 
of  measuring  true  overall  levels  of  quality.  These 
different  purposes  imply  different  sets  of  measures  and 
variables.

The  underlying  question  of  developing  hospital  QA  data  sets 
led  to  a  discussion  about  the  reliability  of  data.  Serious 
questions  persist  about  the  reliability  of  diagnosis — 
although we agreed that the centrality of correct diagnosis
for  reimbursement  has  brought  an  improvement.  Substantial 
questions  also  remain  about  the  accuracy  of  such  elements 
as  discharge  status  data,  and  this  leads  to  concerns  about 
the  confidence  that  we  can  have  about  hospital-specific 
mortality  rates—especially  when  it  comes  to  making  that 
information  public.  We  need  to  do  a  great  deal  more  work 
on  determining  the  reliability  of  data  and  developing 
procedures to improve it, including better "charting
practices."

Another  point  that  was  stressed  concerned  standardization 
and  uniformity  of  data  bases — ideals  that  are  honored  in 
the  breach  at  the  moment,  not  in  the  observance.  A  good 
deal  of  discussion  ensued  about  lack  of  uniformity  of 
medical  records,  lack  of  sequentiality  (i.e.,  continuity  of 
records across care settings), problems of confidentiality,
and  the  need  for  commonly  understood  definitions  and 
heuristics. 

Perhaps  the  most  critical  issue  the  workshop  raised 
concerned  feedback.  The  group's  members  saw  feedback  as 
one  of  the  driving  forces  behind  improving  the  reliability 
of  data,  as  well  as  improving  the  quality  of  care  delivered 
by  hospitals,  physicians,  and  nurses.  We  agreed  it  is 
necessary  to  learn  much  more  about  feedback  mechanisms  and 
how  to  institute  them  effectively  and  efficiently.  It  is 
also  necessary  to  get  clinical  information  back  to  the 
clinical  level;  physicians  cannot  use  aggregate  death  rate 
statistics  to  improve  the  quality  of  care  they  individually 
provide.

We  were  also  asked  to  consider  outcomes.  The  group 
believed  that  outcomes,  as  we  measure  them  today  with  the 
data  we  have  available,  are  at  best  a  crude  signal  that 
something  may  be  wrong.  Some  felt  that  outcomes  should  be 
more than a screening tool; they should be THE measure of
quality of care. Several people reminded us, however,
that the real issues for the elderly are chronic diseases
and  medical  problems,  and  outcomes  of  the  type  most 
commonly  talked  about  these  days,  such  as  inpatient 
mortality  and  readmission,  do  not  tell  us  much 
about  the  end  results  of  care  for  these  conditions.  We 
were  also  reminded  about  the  wide  range  of  patient  values 
and  preferences  that  must  be  considered  in  any  broad 
definition  of  quality  of  care.  The  group  agreed  that  these 
are  crucial  concepts,  yet  recognized  that  we  have  very  few 
specific  ways  of  measuring  them. 

Related  to  both  patient  preferences  and  outcomes  is  the 
question  of  information  disclosure.  What  do  patients  and 
purchasers  really  want  to  know?  Are  we  sure  we're  on  the 
right  track  when  we  focus  on  a  measure  like  hospital- 
specific  mortality  rates?  Or  would  patients  rather  know, 
for  instance,  how  easy  it  is  to  reach  the  hospital,  or  how 
willing  their  providers  are  to  answer  questions  (especially 
in  off  hours)? 

A further set of comments concerned the link between
process  and  outcome  and  the  need  for  considerable  research 
to  strengthen  it.  Some  members  of  the  group  felt  strongly 
that  HCFA  does  have  a  major  role  to  play  in  what  can  almost 
be  called  medical  technology  assessment.  We  need  to 
understand  more  about  medical  practice  and  its 
effectiveness,  so  that  we  are  in  a  better  position  to 
develop criteria for appropriate care. One member
commented  that  this  is  crucial  to  everything  we  talk  about 
in  quality  assurance,  but  it  is  in  nobody's  bailiwick; 
medical  practice  evaluation  and  investigation  of 
effectiveness  of  care  belong  in  many  places,  and  they 
definitely  belong  in  HCFA. 

WORKSHOP  III 

Panel  on  quality  of  care  monitoring  in  capitated  systems 

Robert Brook, M.D., D.Sc., leader; Mark Blumberg,
M.D.; Sheldon Retchin, M.D.; Ann Flood, Ph.D.

Discussion  summarized  by  Dr.  Brook. 


Our  group  found  itself  reiterating  one  policy  issue:  that 
formulating  topics  about  capitated  care  must  be  done  very 
carefully  to  ensure  the  playing  field  is  even.  Since  this 
did  not  seem  especially  productive,  we  quickly  changed  our 
mission. We set ourselves the task of determining the
best, least biased quality indicators for comparing the
fee-for-service  system  and  the  HMO  system. 


We  conducted  a  formal  brainstorming  session,  going  around 
the  room  and  asking  each  member  of  the  group  to  list 
specific  indicators  that  would  serve  the  mission  of  the 
question.  We  then  rated  those  indicators  on  a  combination 
of  importance  and  feasibility  over  the  next  three  years, 
dividing  them  into  high,  medium,  and  low  priorities. 

The  list  is  shown  below.  It  is  encouraging  that  a  number 
of  the  items  on  it  can  be  implemented  right  now,  should  one 
choose  to  do  so. 


Data  Issues  with  Capitated  Systems 

1.  Data  requested 

2.  Self-selection 

3.  Access

4.  Sampling 

5.  Uniform  outpatient  and  inpatient  encounter  form 

6.  Unique  patient  and  provider  identifiers 

7.  Structure/financial  incentives


Ambulatory Care

Priority  Indicator                        Source

High      Stage at time of admission       Face sheet, hospital record,
                                           or tumor registry

High      Tracers of under- and            Medical records
          over-utilization

High      Preventable incidence and        Death certificate, medical
          death (e.g., stroke rate)        records

High      Access to high-cost              Medical records, patient
          technology

Medium    Functional status, perceived     New data collection, acute
          health status (ADL, IADL,        and chronic diagnosis,
          mental, physiological)           adequacy of care

Medium    Follow-up of abnormal tests      Computerized lab data

Medium    State of disease at diagnosis    Cancer tumor register

Medium    Mortality                        Population-based; severity
                                           and health status adjusted

Medium    Disability (loss of work days)   Employment survey data,
                                           work data

Medium    Ability of system to identify    Medical records
          and treat substance abuse

Medium    Symptom-based tracers            Medical records, new data
                                           from patients

Low       Gatekeeping: referral
          non-clinical access criteria

Low       Out-of-plan use

Low       Last year of life technology
          utilization appropriateness

Source listed for low-priority indicators: Research


Unranked  indicators:  DNR  notation,  pregnancy  process 
criteria,  family  planning,  time  from  presentation  to 
diagnosis  to  treatment,  ambulatory  surgery  (procedures  by 
site,  follow-up,  infection  rates) 


Hospital Care

Priority  Indicator                        Source

High      Premature hospital discharge     Stability at discharge
          (case-mix adjustment loss)

High      Unplanned readmissions           Medical records;
          (diagnosis-specific,             hospital data systems
          risk-adjusted)

Low       Access through ER

Unranked indicators: Untoward events, transfers from
out-of-plan care, pre- and post-surgical complications,
perinatal mortality (adjusted for birthweight), state
mortality statistics


Other Care

Priority  Indicator

High      Community-oriented primary prevention
          (preventable cancer, immunization status)

High      Disenrollee characteristics

High      Quality of medical records

Medium    Patient satisfaction

Medium    Quality of telephone care

Medium    Sample implicit death review (humaneness of
          death and dying process and technical aspects)

Low       Continuity and coordination of care

Low       Perceived health status (wait time)

Source: Medical records; patient survey of enrollees and
disenrollees; simulation; Part B, encounter data; time and
nation surveys

Unranked indicators: Alternate care use, transfer
summaries, organizational structure, board certification,
licensure, internal QA systems, use/access to social
workers and information on alternative providers


Home  Care 


Unranked  indicators:  Criteria  and  standards  for  acceptable 
family  care  burden  (family  loss  of  work),  home  care  use,
nursing  home  use,  quality  of  nursing  home  care 


Iatrogenic  Issues 

Medium  priority:  Multiple  drug  use.  Unranked  indicators: 
psychotropic  drug  use,  patient  education

WORKSHOP  IV


Panel  on  the  development  of  a  quality  of  care  data  system 

Philip  Caper,  M.D.,  leader;  William  Munier,  M.D.;
Christopher  Blagg,  M.D.;  Fred  Bodendorf,  M.D.;  Michael
McMullan,  Office  of  Statistics  and  Data  Management,
Health  Care  Financing  Administration

Discussion  summarized  by  Dr.  Caper. 


We  had  a  wide-ranging  discussion,  due  in  part  to  our 
difficulty  in  defining  the  term  "quality".  I  commented, 
somewhat  belatedly,  that  I  didn't  like  the  word  "quality" 
in  the  first  place,  because  it  doesn't  mean  anything  to  me, 
other  than  that  it's  something  which  will  suffer  if  you 
place  any  constraints  whatsoever  on  the  medical  care 
system.  Aside  from  that,  we  directed  our  discussion  toward 
elements  that  need  to  be  better  identified  and  measured, 
such  as  morbidity  and  mortality,  functional  status,  and 
patient  satisfaction. 

We  began  with  a  presentation  from  Michael  McMullan  of  the 
Office  of  Statistics  and  Data  Management,  telling  us  what 
Medicare  collects,  what  is  available  to  the  public,  and  in 
what  form.  It  gave  us  an  excellent  overview  of  the  federal 
government's  activities.  (Editor's  note:  The  presentation 
appears  as  Appendix  I  to  this  volume.) 

Dr.  Christopher  Blagg  talked  about  the  ESRD  Program  data 
base,  which  he  presented  as  a  model  of  what  the  Medicare 
data  base  could  be  in  some  respects.  He  mentioned  the 
importance  of  having  some  diagnostic  information— 
specifically,  health  history  information  and  the  cause  of 
renal  failure — in  addition  to  hospital  discharge 
information,  and  he  suggested  this  could  add  a  powerful 
element  to  the  routinely-collected  Medicare  data  base. 

Dr.  William  B.  Munier  talked  about  the  frustration  of 
trying  to  construct  quality  of  care  data  bases  from  sources 
that  lack  uniformity.  There  is,  he  said,  a  desire  to 
measure  quality  from  information  contained  on  the  Medicare 
bill — which  is  not  possible — and  he  emphasized  two
points.  First,  data  necessary  for  quality  measurement  must 
be  defined  on  a  problem-  or  diagnosis-specific  basis. 
Second,  the  collection  of  data  alone  is  insufficient  to 
make  quality  judgments.  Medical  logic  must  be  applied  to 
data  collected  in  order  to  translate  the  data  into  usable 
information.

Dr.  Fred  Bodendorf  outlined  the  new  law  in  Pennsylvania,
where  systems  to  monitor  quality  of  care  are  still  being 
designed.  He  suggested  that  Medicare  add  one  data  element 
which  Pennsylvania  has  found  to  be  useful:  a  severity  of 
illness  measure  for  hospital  care. 

In  addition,  there  were  suggestions  from  members  of  the 
group  that  we  add  information  to  the  Medicare  data  base 
about  functional  status,  especially  for  nursing  home 
patients.  One  participant  suggested  very  strongly  that 
Medicare  develop  an  iterative  process  to  help  determine  on 
an  ongoing  basis  what  data  elements  to  collect  and  how  to 
use  them.  It  was  also  suggested  that,  as  the  PROs  are  now 
making  major  efforts  in  data  collection  and  analysis, 
Medicare  should  plug  the  PRO  system  into  its  own  data 
collection  process  as  a  feedback  loop. 

On  the  other  hand,  it  was  pointed  out  that  the  federal 
government's  ability  to  collect  and  disseminate  data  is 
constrained  by  budgetary  limitations.  The  Medicare  data 
collection  system  was  not  intended  to  measure  the 
performance  of  the  program;  it  was  intended  to  pay  bills. 
The  functions  and  expectations  of  the  Medicare  system  have 
been  evolving  since  1965  when  the  law  was  enacted,  and  this 
conference  is  part  of  that  evolutionary  process. 

Part  of  our  mandate  was  to  consider  a  series  of  specific 
questions,  primarily  dealing  with  the  role  of  geographic 
variations  in  medical  practice.  The  first  question  had  to 
do  with  interpreting  the  ranges  in  admission  rates  among 
defined  populations:  are  the  lowest  admission  rates  best, 
or  should  the  average  or  norm  be  considered  the  appropriate 
rate?  The  consensus  was  that  the  lowest  rates  are  not 
necessarily  best,  and  that  the  averages  are  just 
statistical  concepts;  they  have  no  clinical  meaning.  They 
are  useful  in  comparing  things,  but  not  in  determining 
which  rate  should  be  the  right  rate.  On  the  other  hand, 
geographic  variations  can  be  useful  as  a  benchmark  for 
quality. 

We  were  also  asked  what  factors  are  needed  to  adjust  the 
"raw"  admissions  rates.  Of  course,  most  of  the  geographic 
variations  information  is  already  corrected  for  age  and 
sex.  You  may  wish  to  correct  for  race  if  you  think  that's 
a  reliable  element,  although  doing  so  would  probably  wash 
out  many  of  the  differences  you  would  want  to  know  about. 
It  was  pointed  out  that  the  population  definition  will 
determine  what  differences  you  find.  Much  of  the 
variations  information  has  been  organized  around  individual 
hospitals  or  small  groups  of  hospitals,  as  a  way  of 
assessing  the  influence  of  medical  decision-making  and  the 
effects  of  differences  in  clinical  criteria  on  admissions
rates.  If  you  wanted  to  assess  the  effect  of  differences
in  income  or  race,  you  would  simply  organize  the 
populations  differently. 
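
The age and sex correction mentioned above can be illustrated with a small worked example. The sketch below assumes the correction is done by direct standardization, which is one common technique and is not named in the discussion. The two areas, the age bands, the standard population, and every number are hypothetical and chosen only to show the arithmetic.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str]  # (age band, sex); the bands are illustrative


def crude_rate(admissions: Dict[Stratum, int],
               population: Dict[Stratum, int]) -> float:
    """Raw admissions per 1,000 persons, ignoring the age and sex mix."""
    return 1000 * sum(admissions.values()) / sum(population.values())


def adjusted_rate(admissions: Dict[Stratum, int],
                  population: Dict[Stratum, int],
                  standard: Dict[Stratum, int]) -> float:
    """Directly standardized admissions per 1,000 persons.

    Each stratum-specific rate is weighted by that stratum's share
    of a common (hypothetical) standard population.
    """
    total_standard = sum(standard.values())
    rate = 0.0
    for stratum, standard_count in standard.items():
        stratum_rate = admissions[stratum] / population[stratum]
        rate += stratum_rate * standard_count / total_standard
    return 1000 * rate


# Hypothetical data: both areas admit at the same rate within each
# stratum, but area B has an older population.
strata = [("65-74", "F"), ("65-74", "M"), ("75+", "F"), ("75+", "M")]
area_a_pop = dict(zip(strata, [1000, 1000, 500, 500]))
area_a_adm = dict(zip(strata, [200, 250, 200, 250]))
area_b_pop = dict(zip(strata, [500, 500, 1000, 1000]))
area_b_adm = dict(zip(strata, [100, 125, 400, 500]))
standard_pop = dict(zip(strata, [1000, 1000, 1000, 1000]))

for name, adm, pop in [("A", area_a_adm, area_a_pop),
                       ("B", area_b_adm, area_b_pop)]:
    print(f"Area {name}: raw {crude_rate(adm, pop):.1f}, "
          f"adjusted {adjusted_rate(adm, pop, standard_pop):.1f} per 1,000")
```

In this made-up case the raw rates differ (300 against 375 per 1,000) while the adjusted rates coincide at 337.5, so the whole gap comes from the age mix. Regrouping the same records by income or race, as suggested above, would only change the keys used for the strata.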

A  third  question:  as  we  move  toward  increased  utilization 
of  managed  care  and  capitation,  does  geographic  variation 
take  on  a  new  light?  Even  though  average  rates, 
particularly  hospitalization  rates,  are  lower  in  managed 
care  systems  than  in  the  fee-for-service  system,  the
variations  persist.  So  the  utility  of  looking  at 
variations  as  a  way  of  studying  differences  among 
physicians  and  the  way  they  use  resources  is  retained,  even 
though  the  averages  are  lower. 

Finally,  the  group  found  it  interesting  to  note  that,  as 
overall  utilization  rates  have  gone  down  in  the  past  few 
years,  the  rate  of  decrease  has  been  just  about  the  same  in 
the  low-use  areas  as  in  the  high-use  areas.  We  don't  know 
quite  what  to  make  of  that,  except  to  conclude  that  there 
is  probably  still  plenty  of  discretionary  utilization  left 
in  the  system. 